Boosting Adversarial Attack with Similar Target

Shuo Zhang; Ziruo Wang; Zikai Zhou; Huanran Chen

Deep neural networks are vulnerable to adversarial examples, which threatens their deployment and raises security concerns. An intriguing property of adversarial examples is their strong transferability. Several methods have been proposed to enhance transferability, including ensemble attacks, which have proven effective. However, prior approaches simply average the logits, probabilities, or losses of the ensembled models, and they lack a comprehensive analysis of how and why model ensembling significantly improves transferability. In this paper, we propose an attack method named Similar Target (ST). By promoting cosine similarity between the gradients of the individual models, our method regularizes the optimization direction so that it attacks all surrogate models simultaneously. This strategy has been proven to enhance generalization ability. Experimental results on ImageNet validate the effectiveness of our approach in improving adversarial transferability. Our method outperforms state-of-the-art attackers on 18 discriminative classifiers and adversarially trained models.
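The abstract does not give the exact objective, so the following is only a minimal sketch of the general idea of adding a gradient cosine-similarity term to an averaged ensemble loss. The function names, the weighting parameter `lam`, and the choice of averaging over all model pairs are assumptions made for illustration, not the authors' formulation. Gradients are represented as plain lists of floats, one list per surrogate model.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two gradient vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for a zero gradient
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(grads):
    """Average cosine similarity over all unordered pairs of model gradients."""
    pairs = list(combinations(grads, 2))
    if not pairs:
        return 0.0  # a single model has no pair to compare
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


def regularized_objective(losses, grads, lam=0.1):
    """Hypothetical attacker objective (to be maximized): the mean surrogate
    loss plus `lam` times the mean pairwise gradient cosine similarity.
    `lam` is an assumed hyperparameter, not a value taken from the paper."""
    mean_loss = sum(losses) / len(losses)
    return mean_loss + lam * mean_pairwise_cosine(grads)
```

Under these assumptions, a perturbation for which the surrogate gradients point in similar directions scores higher than one with the same mean loss but conflicting gradients, which is the intuition the abstract gives for attacking all surrogate models at once.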

updated: Mon Aug 21 2023 14:16:36 GMT+0000 (UTC)

published: Mon Aug 21 2023 14:16:36 GMT+0000 (UTC)

arXiv
