Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input

Junyoung Byun; Seungju Cho; Myung-Joon Kwon; Hee-Seon Kim; Changick Kim

The transferability of adversarial examples makes it possible to deceive black-box models, and transfer-based targeted attacks have attracted considerable interest because of their practical applicability. To maximize the transfer success rate, adversarial examples should avoid overfitting to the source model, and image augmentation is one of the primary approaches to this. However, prior works rely on simple image transformations such as resizing, which limits input diversity. To address this limitation, we propose the object-based diverse input (ODI) method, which draws an adversarial image on a 3D object and induces the rendered image to be classified as the target class. Our motivation comes from humans' superior perception of an image printed on a 3D object: if the image is clear enough, humans can recognize its content under a variety of viewing conditions. Likewise, if an adversarial example looks like the target class to the model, the model should also classify the rendered image of the 3D object as the target class. The ODI method effectively diversifies the input by leveraging an ensemble of multiple source objects and randomizing viewing conditions. In our experiments on the ImageNet-Compatible dataset, this method raises the average targeted attack success rate from 28.3% to 47.0% compared with state-of-the-art methods. We also demonstrate that the ODI method applies to adversarial examples on the face verification task, where it likewise yields a superior performance improvement. Our code is available at https://github.com/dreamflake/ODI.
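
As a rough illustration of the input-diversification loop the abstract describes, the sketch below runs a targeted iterative attack in which each step sees a freshly randomized view of the adversarial image. It is not the authors' implementation (that is in the linked repository). The 3D renderer is swapped for a much simpler stand-in, a random circular shift plus a brightness gain, and the linear classifier `W`, all function names, and the values of `eps`, `step`, and `iters` are assumptions chosen only to keep the example self-contained.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def random_view(rng):
    """Sample hypothetical viewing conditions: a small shift and a brightness gain.

    This is a stand-in for the randomized 3D rendering in ODI, not the real thing.
    """
    dy, dx = rng.integers(-2, 3, size=2)
    gain = rng.uniform(0.8, 1.2)
    return int(dy), int(dx), float(gain)


def apply_view(img, view):
    """Apply the stand-in transformation: circular shift, then brightness scaling."""
    dy, dx, gain = view
    return gain * np.roll(img, (dy, dx), axis=(0, 1))


def apply_view_transpose(grad, view):
    """Adjoint of apply_view, used to carry gradients back to the untransformed image."""
    dy, dx, gain = view
    return gain * np.roll(grad, (-dy, -dx), axis=(0, 1))


def targeted_attack(x, W, target, eps=0.1, step=0.01, iters=100, seed=0):
    """Targeted sign-gradient attack on a toy linear classifier with random views.

    x      : clean image in [0, 1], shape (H, W_img)
    W      : toy classifier weights, shape (num_classes, H * W_img) (assumption)
    target : index of the class the attack tries to induce
    """
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    for _ in range(iters):
        view = random_view(rng)                # new viewing conditions every step
        transformed = apply_view(x_adv, view)  # "rendered" input seen by the model
        probs = softmax(W @ transformed.ravel())

        # Gradient of the cross-entropy toward the target class w.r.t. the logits.
        grad_logits = probs.copy()
        grad_logits[target] -= 1.0

        # Backpropagate through the linear model, then through the transformation.
        grad_transformed = (W.T @ grad_logits).reshape(x.shape)
        grad_x = apply_view_transpose(grad_transformed, view)

        # Descend on the targeted loss, then project back into the L-infinity ball
        # around x and into the valid pixel range.
        x_adv = x_adv - step * np.sign(grad_x)
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv
```

Calling `targeted_attack(x, W, target=3)` on an 8x8 image `x` and a 5x64 weight matrix `W` returns a perturbed image that stays within `eps` of `x`. Because the view changes at every step, the perturbation cannot rely on one fixed pixel alignment or brightness level; ODI pursues the same goal with far richer variation by rendering the image on 3D objects.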

updated: Thu Mar 17 2022 06:57:14 GMT+0000 (UTC)

published: Thu Mar 17 2022 06:57:14 GMT+0000 (UTC)

arXiv
