PDPGD: Primal-Dual Proximal Gradient Descent Adversarial Attack

Alexander Matyasko; Lap-Pui Chau

State-of-the-art deep neural networks are sensitive to small input perturbations. Since the discovery of this intriguing vulnerability, many defence methods have been proposed that attempt to improve robustness to adversarial noise. Fast and accurate attacks are required to compare these defence methods. However, evaluating adversarial robustness has proven to be extremely challenging. Existing norm-minimisation adversarial attacks require thousands of iterations (e.g. the Carlini & Wagner attack), are limited to specific norms (e.g. Fast Adaptive Boundary), or produce sub-optimal results (e.g. the Brendel & Bethge attack). On the other hand, the PGD attack, which is fast, general and accurate, ignores the norm-minimisation penalty and solves a simpler perturbation-constrained problem. In this work, we introduce a fast, general and accurate adversarial attack that optimises the original non-convex constrained minimisation problem. We interpret optimising the Lagrangian of the adversarial attack optimisation problem as a two-player game: the first player minimises the Lagrangian with respect to the adversarial noise; the second player maximises the Lagrangian with respect to the regularisation penalty. Our attack algorithm simultaneously optimises the primal and dual variables to find the minimal adversarial perturbation. In addition, for non-smooth l_p-norm minimisation, such as the l_∞-, l_1-, and l_0-norms, we introduce the primal-dual proximal gradient descent (PDPGD) attack. We show in the experiments that our attack outperforms current state-of-the-art l_∞-, l_2-, l_1-, and l_0-attacks on the MNIST, CIFAR-10 and Restricted ImageNet datasets against unregularised and adversarially trained models.
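The two-player view described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear classifier f(x) = w·x + b in place of a deep network, uses the smooth penalty 0.5·||δ||_2^2 so that the proximal step has a simple closed form, and all function names, step sizes and iteration counts are hypothetical choices made for illustration. The primal player takes a gradient step on the constraint term followed by a proximal step on the norm penalty; the dual player takes a projected gradient ascent step on the multiplier.

```python
import math


def linear_logit(w, b, x):
    """Score of a toy linear classifier; a positive score means the original class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def prox_half_sq_l2(v, step):
    """Proximal operator of step * 0.5 * ||v||_2^2, which reduces to a uniform shrinkage."""
    return [vi / (1.0 + step) for vi in v]


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


def primal_dual_attack(w, b, x, step=0.1, iters=2000):
    """Toy primal-dual proximal gradient loop (illustrative assumption, not the paper's code).

    Approximately solves: minimise 0.5 * ||delta||^2 subject to f(x + delta) <= 0,
    using the Lagrangian 0.5 * ||delta||^2 + lam * f(x + delta).
    """
    delta = [0.0] * len(x)  # primal variable: the adversarial perturbation
    lam = 0.0               # dual variable: the Lagrange multiplier, kept >= 0
    for _ in range(iters):
        # Constraint value at the current perturbation.
        g = linear_logit(w, b, [xi + di for xi, di in zip(x, delta)])
        # Primal player: gradient step on lam * f (its gradient is lam * w),
        # then a proximal step on the norm penalty.
        moved = [di - step * lam * wi for di, wi in zip(delta, w)]
        new_delta = prox_half_sq_l2(moved, step)
        # Dual player: projected gradient ascent on the multiplier.
        lam = max(0.0, lam + step * g)
        delta = new_delta  # both players update simultaneously
    return delta, lam


if __name__ == "__main__":
    w, b, x = [0.6, 0.8], -0.4, [1.0, 1.0]
    delta, lam = primal_dual_attack(w, b, x)
    print("perturbation:", delta)
    print("multiplier:", lam)
    print("perturbation norm:", l2_norm(delta))
    print("closed-form distance:", abs(linear_logit(w, b, x)) / l2_norm(w))
```

For a linear classifier the minimal l_2 perturbation is known in closed form, |f(x)| / ||w||_2, so the last two printed values can be compared directly. Note that the iterate converges to the decision boundary itself rather than strictly beyond it. For a deep network the constraint gradient would come from automatic differentiation and the problem is non-convex, so this toy convergence behaviour should not be assumed to carry over.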

updated: Thu Jun 03 2021 01:45:48 GMT+0000 (UTC)

published: Thu Jun 03 2021 01:45:48 GMT+0000 (UTC)

arXiv

