Bridging the Performance Gap between FGSM and PGD Adversarial Training

Tianjin Huang; Vlado Menkovski; Yulong Pei; Mykola Pechenizkiy

Deep learning achieves state-of-the-art performance in many tasks but remains vulnerable to adversarial examples. Among existing defense techniques, adversarial training with the projected gradient descent attack (adv.PGD) is considered one of the most effective ways to achieve moderate adversarial robustness. However, adv.PGD requires a long training time because the projected gradient descent attack (PGD) takes multiple iterations to generate perturbations. In contrast, adversarial training with the fast gradient sign method (adv.FGSM) trains much faster because the fast gradient sign method (FGSM) generates perturbations in a single step, but it fails to increase adversarial robustness. In this work, we extend adv.FGSM so that it achieves the adversarial robustness of adv.PGD. We demonstrate that large curvature along the FGSM perturbation direction leads to a large difference in adversarial robustness between adv.FGSM and adv.PGD, and we therefore propose combining adv.FGSM with a curvature regularization (adv.FGSMR) to bridge the performance gap between the two. Experiments show that adv.FGSMR has higher training efficiency than adv.PGD. In addition, on the MNIST dataset it achieves comparable adversarial robustness under white-box attack, and on the CIFAR-10 dataset it outperforms adv.PGD under white-box attack and effectively defends against transferable adversarial attacks.
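
The contrast between the one-step FGSM perturbation, the multi-step PGD perturbation, and a curvature measure along the FGSM direction can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the model is a toy logistic regression with an analytic input gradient, the function names (`loss_and_grad`, `fgsm`, `pgd`, `curvature_gap`) are hypothetical, and `curvature_gap` is only one plausible curvature proxy (the deviation of the loss from its first-order approximation along the FGSM direction). It is not necessarily the regularizer used in the paper.

```python
import numpy as np


def loss_and_grad(w, b, x, y):
    """Logistic loss for one sample and its gradient w.r.t. the input x.

    Assumption: a toy linear model with label y in {0, 1}.
    """
    z = float(np.dot(w, x) + b)
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad_x = (p - y) * w  # analytic d(loss)/dx for logistic regression
    return loss, grad_x


def fgsm(w, b, x, y, eps):
    """One-step attack: move eps along the sign of the input gradient."""
    _, g = loss_and_grad(w, b, x, y)
    return x + eps * np.sign(g)


def pgd(w, b, x, y, eps, step, iters):
    """Multi-step attack: repeated signed-gradient steps, each followed by
    projection back onto the L-infinity ball of radius eps around x."""
    x_adv = x.copy()
    for _ in range(iters):
        _, g = loss_and_grad(w, b, x_adv, y)
        x_adv = x_adv + step * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv


def curvature_gap(w, b, x, y, eps):
    """Hypothetical curvature proxy: how far the loss at the FGSM point
    deviates from its first-order (linear) approximation around x."""
    loss0, g = loss_and_grad(w, b, x, y)
    delta = eps * np.sign(g)
    loss1, _ = loss_and_grad(w, b, x + delta, y)
    return abs(loss1 - loss0 - float(np.dot(g, delta)))
```

In this sketch PGD costs `iters` gradient evaluations per sample where FGSM costs one, which is the training-time difference the abstract describes. A training loop in the spirit of adv.FGSMR would add a weighted term such as `curvature_gap` to the loss on FGSM examples; the weighting and the exact form of the term are left open here because the abstract does not specify them.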

updated: Sat Nov 07 2020 09:08:54 GMT+0000 (UTC)

published: Sat Nov 07 2020 09:08:54 GMT+0000 (UTC)

arXiv

