Patch-wise++ Perturbation for Adversarial Targeted Attacks

Lianli Gao; Qilong Zhang; Jingkuan Song; Heng Tao Shen

Although adversarial attacks on deep neural networks (DNNs) have made great progress, their transferability remains unsatisfactory, especially for targeted attacks. Two long-overlooked problems lie behind this: 1) the conventional setting runs T iterations with a step size of ϵ/T to comply with the ϵ-constraint, so most pixels are allowed to add only very small noise, much less than ϵ; and 2) noise is usually manipulated pixel-wise, even though the features that DNNs extract for a pixel are influenced by its surrounding regions, and different DNNs generally focus on different discriminative regions in recognition. To tackle these issues, our previous work proposed a patch-wise iterative method (PIM) aimed at crafting adversarial examples with high transferability. Specifically, we introduce an amplification factor to the step size in each iteration, and the part of a pixel's overall gradient that overflows the ϵ-constraint is properly assigned to its surrounding regions by a project kernel. However, targeted attacks aim to push adversarial examples into the territory of a specific class, and the amplification factor may lead to underfitting. We therefore introduce a temperature and propose a patch-wise++ iterative method (PIM++) that further improves transferability without significantly sacrificing the performance of the white-box attack. Our method can be integrated into any gradient-based attack method. Compared with the current state-of-the-art attack methods, we significantly improve the success rate by 33.1% for defense models and 31.4% for normally trained models on average.
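
The amplified step and the project kernel described in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the names `beta` (amplification factor), `gamma` (projection step), the uniform zero-centre kernel, and the order of the updates are all assumptions made for illustration. The gradient is a fixed toy array instead of a model gradient, and the temperature of PIM++ is not modelled, because the abstract does not say where it enters.

```python
# Illustrative sketch only; not the authors' reference implementation.
# beta, gamma, the kernel shape and the update order are assumptions.

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def project_kernel(size):
    """Uniform size x size kernel with a zero centre (assumed form)."""
    weight = 1.0 / (size * size - 1)
    kernel = [[weight] * size for _ in range(size)]
    kernel[size // 2][size // 2] = 0.0
    return kernel


def convolve_same(image, kernel):
    """Zero-padded 2D correlation; the output has the same shape as image."""
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < height and 0 <= x < width:
                        acc += image[y][x] * kernel[di + radius][dj + radius]
            out[i][j] = acc
    return out


def patchwise_step(noise, accum, grad, eps, num_iters, beta, gamma, kernel):
    """One iteration: amplified sign step, overflow reassigned to neighbours."""
    height, width = len(noise), len(noise[0])
    step = beta * eps / num_iters  # amplified step; the baseline is eps / num_iters
    # Accumulate the amplified noise without clipping it.
    accum = [[accum[i][j] + step * sign(grad[i][j]) for j in range(width)]
             for i in range(height)]
    # The part of the accumulated noise that exceeds the eps-constraint.
    overflow = [[max(abs(a) - eps, 0.0) * sign(a) for a in row] for row in accum]
    # Spread each pixel's overflow over its surrounding region.
    spread = convolve_same(overflow, kernel)
    new_noise = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            projected = gamma * sign(spread[i][j])
            accum[i][j] += projected
            total = noise[i][j] + step * sign(grad[i][j]) + projected
            new_noise[i][j] = max(-eps, min(eps, total))  # enforce the eps-constraint
    return new_noise, accum


def demo(steps=2):
    """Toy run on a 5x5 grid where only the centre pixel has a gradient."""
    eps, num_iters, beta = 16.0, 10, 10.0
    gamma = beta * eps / num_iters  # assumed equal to the amplified step
    grad = [[0.0] * 5 for _ in range(5)]
    grad[2][2] = 1.0
    noise = [[0.0] * 5 for _ in range(5)]
    accum = [[0.0] * 5 for _ in range(5)]
    kernel = project_kernel(3)
    for _ in range(steps):
        noise, accum = patchwise_step(noise, accum, grad, eps, num_iters,
                                      beta, gamma, kernel)
    return noise


if __name__ == "__main__":
    for row in demo(steps=2):
        print(row)
```

In this toy run the centre pixel reaches the ϵ bound after the first step, and from the second step on its overflow is handed to the eight surrounding pixels even though their own gradient is zero. With the conventional ϵ/T step and no kernel, those neighbours would receive no noise at all.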

updated: Tue Jun 08 2021 12:52:44 GMT+0000 (UTC)

published: Thu Dec 31 2020 08:40:42 GMT+0000 (UTC)

arXiv
