GNP Attack: Transferable Adversarial Examples via Gradient Norm Penalty

Tao Wu; Tie Luo; Donald C. Wunsch

Adversarial examples (AEs) with good transferability enable practical black-box attacks on diverse target models, because no insider knowledge of the target models is required. Previous methods often generate AEs with little or no transferability: they easily overfit to the particular architecture and feature representation of the white-box source model, so the generated AEs barely work against black-box target models. In this paper, we propose a novel approach that enhances AE transferability using a Gradient Norm Penalty (GNP), which drives the loss function optimization procedure to converge to a flat region of local optima in the loss landscape. By attacking 11 state-of-the-art (SOTA) deep learning models and 6 advanced defense methods, we empirically show that GNP is very effective at generating AEs with high transferability. We also demonstrate that it is flexible: it can be easily integrated with other gradient-based methods for stronger transfer-based attacks.
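The abstract does not spell out the update rule, so the following is only a minimal sketch of one way a gradient norm penalty could be folded into a sign-gradient attack. It assumes the penalized objective is the loss minus a multiple of the L2 norm of its input gradient, and that the penalty's gradient is approximated with a finite difference along the normalized gradient direction. The names `gnp_gradient`, `gnp_step`, `toy_grad`, and the parameters `r`, `beta`, `alpha`, and `eps` are all hypothetical and chosen for illustration; a toy quadratic loss with an analytic gradient stands in for a real model.

```python
import math


def gnp_gradient(grad_fn, x, r=0.01, beta=0.8):
    """Approximate the gradient of loss(x) - lam * ||grad loss(x)||_2.

    Assumed form: (1 + beta) * g(x) - beta * g(x + r * g(x) / ||g(x)||),
    with beta = lam / r. This is an illustrative assumption, not a formula
    quoted from the abstract.
    """
    g = grad_fn(x)
    norm = math.sqrt(sum(v * v for v in g))
    if norm == 0.0:
        return list(g)  # stationary point: no direction to probe
    # Probe a nearby point along the normalized gradient direction.
    x_nbr = [xi + r * gi / norm for xi, gi in zip(x, g)]
    g_nbr = grad_fn(x_nbr)
    return [(1.0 + beta) * gi - beta * gni for gi, gni in zip(g, g_nbr)]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def gnp_step(grad_fn, x_adv, x_clean, alpha, eps, r=0.01, beta=0.8):
    """One sign-gradient ascent step, clipped to the L-inf ball around x_clean."""
    g = gnp_gradient(grad_fn, x_adv, r, beta)
    stepped = [xi + alpha * sign(gi) for xi, gi in zip(x_adv, g)]
    return [min(max(si, ci - eps), ci + eps) for si, ci in zip(stepped, x_clean)]


def toy_grad(x):
    """Gradient of the toy loss 0.5 * (x0^2 + 4 * x1^2), namely (x0, 4 * x1)."""
    return [x[0], 4.0 * x[1]]
```

In this sketch the second gradient evaluation is what penalizes sharp directions: where the gradient changes quickly near `x`, the subtracted term grows and the step is steered toward flatter regions. With a real model, `grad_fn` would be replaced by a backward pass through the source network, which doubles the gradient cost per iteration.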

updated: Sun Jul 09 2023 05:21:31 GMT+0000 (UTC)

published: Sun Jul 09 2023 05:21:31 GMT+0000 (UTC)

arXiv
