On Generating Transferable Targeted Perturbations

Muzammal Naseer; Salman Khan; Munawar Hayat; Fahad Shahbaz Khan; Fatih Porikli

While the untargeted black-box transferability of adversarial perturbations has been studied extensively, changing an unseen model's decision to a specific "targeted" class remains challenging. In this paper, we propose a new generative approach for highly transferable targeted perturbations (TTP). We note that existing methods are less suitable for this task because they rely on class-boundary information, which changes from one model to another and thus reduces transferability. In contrast, our approach matches the "distribution" of perturbed images with that of the target class, leading to high targeted transferability rates. To this end, we propose a new objective function that not only aligns the global distributions of source and target images, but also matches the local neighbourhood structure between the two domains. Based on the proposed objective, we train a generator function that can adaptively synthesize perturbations specific to a given input. Our generative approach is independent of the source or target domain labels, yet consistently performs well against state-of-the-art methods across a wide range of attack settings. As an example, we achieve 32.63% target transferability from (an adversarially weak) VGG19_BN to (a strong) WideResNet on the ImageNet validation set, which is 4× higher than the previous best generative attack and 16× better than instance-specific iterative attacks. Code is available at: https://github.com/Muzammal-Naseer/TTP.
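The abstract names two ingredients of the objective (global distribution alignment and local neighbourhood matching) but does not give their form. The following is a minimal sketch of one way such a loss could be assembled, not the authors' implementation. It assumes a symmetric KL divergence between the source model's softmax outputs for the global term, and a symmetric KL between row-normalised cosine-similarity matrices within a batch for the local term. All function names (`global_loss`, `local_loss`, `total_loss`) and the equal weighting of the two terms are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def symmetric_kl(p, q):
    # Assumption: a symmetric divergence; the abstract does not name one.
    return kl(p, q) + kl(q, p)

def global_loss(adv_logits, tgt_logits):
    # Align output distributions of perturbed and target-class images.
    losses = [symmetric_kl(softmax(a), softmax(t))
              for a, t in zip(adv_logits, tgt_logits)]
    return sum(losses) / len(losses)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def neighbourhood(batch):
    # Row i is a distribution over how similar sample i is to each sample.
    return [softmax([cosine(u, v) for v in batch]) for u in batch]

def local_loss(adv_logits, tgt_logits):
    # Match the within-batch similarity structure of the two domains.
    rows = zip(neighbourhood(adv_logits), neighbourhood(tgt_logits))
    losses = [symmetric_kl(a, t) for a, t in rows]
    return sum(losses) / len(losses)

def total_loss(adv_logits, tgt_logits):
    # Assumption: equal weighting of the global and local terms.
    return global_loss(adv_logits, tgt_logits) + local_loss(adv_logits, tgt_logits)

# Toy logits from a hypothetical 3-class source model, batch of 2.
adv = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]   # perturbed source images
tgt = [[0.0, 0.0, 2.0], [0.0, 0.1, 2.0]]   # real target-class images
print(total_loss(adv, tgt))
```

In a training loop, a loss of this kind would be minimised with respect to the generator's parameters while the source model stays fixed. Note that neither term uses ground-truth labels, which is consistent with the label-independence claimed in the abstract.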

updated: Fri Aug 13 2021 19:05:41 GMT+0000 (UTC)

published: Fri Mar 26 2021 17:55:28 GMT+0000 (UTC)

arXiv
