Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks

Xu Han; Anmin Liu; Chenxuan Yao; Yanbo Fan; Kun He

Deep neural networks are known to be vulnerable to adversarial examples crafted by adding human-imperceptible perturbations to benign inputs. After attacks reached nearly 100% success rates in the white-box setting, the focus shifted to black-box attacks, where the transferability of adversarial examples has gained significant attention. In either case, common gradient-based methods generally apply the sign function to the gradient to generate perturbations at each update, which offers a roughly correct direction and has been highly successful. However, little work has examined the possible limitations of this choice. In this work, we observe that the deviation between the original gradient and the generated noise may lead to inaccurate gradient update estimation and to suboptimal solutions for adversarial transferability. To address this, we propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM). Specifically, we use data rescaling in place of the sign function, at no extra computational cost. We further propose a Depth First Sampling method to eliminate the fluctuation of rescaling and stabilize the gradient update. Our method can be used in any gradient-based attack and can be integrated with various input transformation or ensemble methods to further improve adversarial transferability. Extensive experiments on the standard ImageNet dataset show that our method significantly boosts the transferability of gradient-based attacks and outperforms state-of-the-art baselines.
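
To make the contrast between a sign-based step and a rescaled step concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not state the rescaling formula, so `rescale_step`, its constant `c`, and the choice to normalize log2 magnitudes and pass them through a sigmoid are assumptions made for this example, not the authors' implementation. The Depth First Sampling step is not sketched because the abstract gives no detail about it.

```python
import math


def sign_step(grad, alpha):
    """Sign-based step: every nonzero coordinate moves by exactly alpha."""
    return [alpha * ((g > 0) - (g < 0)) for g in grad]


def rescale_step(grad, alpha, c=2.0, tiny=1e-12):
    """Hypothetical rescaled step that keeps relative gradient magnitudes.

    Assumption (not taken from the abstract): z-score the log2 magnitudes,
    squash them with a sigmoid into the range (0, c), and reattach the sign.
    """
    logs = [math.log2(abs(g) + tiny) for g in grad]
    mean = sum(logs) / len(logs)
    std = math.sqrt(sum((v - mean) ** 2 for v in logs) / len(logs))
    std = std or 1.0  # all magnitudes equal: avoid division by zero
    step = []
    for g, v in zip(grad, logs):
        scale = c / (1.0 + math.exp(-(v - mean) / std))
        step.append(alpha * scale * ((g > 0) - (g < 0)))
    return step


grad = [0.5, -0.01, 0.002, -0.3]
print(sign_step(grad, alpha=1.0))     # magnitudes are all identical
print(rescale_step(grad, alpha=1.0))  # magnitudes follow the gradient
```

In the sign-based step, a coordinate with gradient 0.002 moves as far as one with gradient 0.5, which is the kind of deviation between gradient and noise that the abstract describes. The rescaled step keeps the same directions but lets larger gradient entries take larger steps, bounded here by `c * alpha`.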

updated: Thu Jul 06 2023 07:52:42 GMT+0000 (UTC)

published: Thu Jul 06 2023 07:52:42 GMT+0000 (UTC)

arXiv
