Local Black-box Adversarial Attacks: A Query Efficient Approach

Tao Xiang; Hangcheng Liu; Shangwei Guo; Tianwei Zhang; Xiaofeng Liao

Adversarial attacks threaten the application of deep neural networks in security-sensitive scenarios. Most existing black-box attacks fool the target model by querying it many times and producing global perturbations. However, global perturbations also alter smooth, insignificant background regions, which makes the perturbation easier to perceive and increases the query overhead. In this paper, we propose a novel framework that perturbs only the discriminative areas of clean examples, within a limited number of queries, in the black-box setting. Our framework builds on two types of transferability. The first is the transferability of model interpretations, which lets us easily identify the discriminative areas of a given clean example for local perturbation. The second is the transferability of adversarial examples, which helps us produce a local pre-perturbation that improves query efficiency. After identifying the discriminative areas and pre-perturbing, we generate the final adversarial example from the pre-perturbed example by querying the target model with two kinds of black-box attack techniques: gradient estimation and random search. We conduct extensive experiments showing that our framework significantly improves query efficiency while maintaining a high attack success rate. Experimental results show that our attacks outperform state-of-the-art black-box attacks under various system settings.
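The idea of restricting a random-search attack to a discriminative region can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `masked_random_search`, the oracle `margin_fn`, the optional pre-perturbation `delta0`, and the toy linear "model" are all assumptions made for illustration, and the mask here is derived from the toy weights as a stand-in for a saliency map obtained from a surrogate model.

```python
import numpy as np

def masked_random_search(x, mask, margin_fn, eps=0.1, max_queries=500,
                         delta0=None, seed=0):
    """Random-search attack that only perturbs pixels where mask == 1.

    margin_fn(x) is a hypothetical black-box oracle: positive while the
    model is still correct, non-positive once the example is misclassified.
    delta0 is an optional pre-perturbation (e.g. transferred from a surrogate).
    """
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(mask)                 # coordinates we may change
    delta = np.zeros_like(x) if delta0 is None else delta0 * mask
    best = margin_fn(np.clip(x + delta, 0.0, 1.0))
    queries = 1
    while best > 0 and queries < max_queries:
        cand = delta.copy()
        # Propose new +/- eps values on a small random subset of the region.
        pick = rng.choice(idx, size=max(1, idx.size // 10), replace=False)
        cand.flat[pick] = eps * rng.choice([-1.0, 1.0], size=pick.size)
        score = margin_fn(np.clip(x + cand, 0.0, 1.0))
        queries += 1
        if score < best:                       # keep the proposal only if it helps
            best, delta = score, cand
    return np.clip(x + delta, 0.0, 1.0), queries, best

# Toy setup (assumption): a linear score stands in for the target model.
rng = np.random.default_rng(1)
w = rng.normal(size=(8, 8))
x = rng.uniform(0.2, 0.8, size=(8, 8))
b = 0.05 - np.sum(w * x)                       # clean example has margin 0.05
margin_fn = lambda z: float(np.sum(w * z) + b)
# Stand-in for a surrogate saliency map: the 25% largest-magnitude weights.
mask = (np.abs(w) >= np.quantile(np.abs(w), 0.75)).astype(float)

x_adv, queries, margin = masked_random_search(x, mask, margin_fn)
print(queries, margin)
```

Because proposals are drawn only from the masked coordinates, every query is spent on the region assumed to matter, and pixels outside the mask are left untouched. A gradient-estimation variant would follow the same pattern, estimating finite differences only along masked coordinates.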

updated: Mon Jan 04 2021 15:32:16 GMT+0000 (UTC)

published: Mon Jan 04 2021 15:32:16 GMT+0000 (UTC)

arXiv
