Demystifying the Adversarial Robustness of Random Transformation Defenses

Chawin Sitawarin; Zachary Golan-Strieb; David Wagner

Neural networks' lack of robustness against attacks raises concerns in security-sensitive settings such as autonomous vehicles. While many countermeasures look promising, only a few withstand rigorous evaluation. Defenses using random transformations (RT) have shown impressive results, particularly BaRT (Raff et al., 2019) on ImageNet. However, this type of defense has not been rigorously evaluated, leaving its robustness properties poorly understood. The stochastic nature of these defenses makes evaluation more challenging and renders many proposed attacks on deterministic models inapplicable. First, we show that the BPDA attack (Athalye et al., 2018a) used in BaRT's evaluation is ineffective and likely overestimates its robustness. We then attempt to construct the strongest possible RT defense through the informed selection of transformations and Bayesian optimization for tuning their parameters. Furthermore, we create the strongest possible attack to evaluate our RT defense. Our new attack vastly outperforms the baseline, reducing the accuracy by 83% compared to the 19% reduction by the commonly used EoT attack (4.3× improvement). Our results indicate that the RT defense on the Imagenette dataset (a ten-class subset of ImageNet) is not robust against adversarial examples. Extending the study further, we use our new attack to adversarially train the RT defense (called AdvRT), resulting in a large robustness gain. Code is available at https://github.com/wagner-group/demystify-random-transform.
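
To make the EoT baseline mentioned in the abstract concrete, the following is a minimal sketch of an Expectation over Transformation attack against a randomized defense. It is not the paper's attack or code. It assumes a toy linear classifier, a hypothetical random transformation (per-feature scaling plus Gaussian noise), an analytic gradient, and an L-infinity budget; all function names and parameter values are illustrative.

```python
import random

def random_transform(x, rng):
    """Hypothetical RT: random per-feature scaling plus small additive noise."""
    scales = [rng.uniform(0.5, 1.5) for _ in x]
    noise = [rng.gauss(0.0, 0.1) for _ in x]
    return [s * xi + n for s, xi, n in zip(scales, x, noise)], scales

def score(w, x, rng):
    """Toy linear classifier applied to one randomly transformed copy of x."""
    tx, _ = random_transform(x, rng)
    return sum(wi * ti for wi, ti in zip(w, tx))

def eot_gradient(w, x, y, rng, n_samples):
    """Monte Carlo estimate of the gradient of E_t[-y * w.t(x)] w.r.t. x.

    For t(x) = s * x + n, the per-sample gradient is -y * w * s.
    """
    grad = [0.0] * len(x)
    for _ in range(n_samples):
        _, scales = random_transform(x, rng)
        for i, (wi, si) in enumerate(zip(w, scales)):
            grad[i] += -y * wi * si / n_samples
    return grad

def eot_pgd(w, x0, y, eps, step, n_steps, n_samples, seed=0):
    """Signed-gradient ascent on the averaged loss, projected to the eps-ball."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_steps):
        g = eot_gradient(w, x, y, rng, n_samples)
        x = [xi + step * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, g)]
        x = [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]
    return x

def accuracy_under_rt(w, x, y, trials=1000, seed=1):
    """Fraction of random transformations under which x is classified as y."""
    rng = random.Random(seed)
    correct = sum(1 for _ in range(trials) if y * score(w, x, rng) > 0)
    return correct / trials
```

Averaging the gradient over several sampled transformations is what distinguishes EoT from a single-sample gradient step: each step follows the expected loss rather than one random draw of the defense. For a real network the analytic gradient here would be replaced by backpropagation through each sampled transformation.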

updated: Fri Jul 15 2022 10:19:22 GMT+0000 (UTC)

published: Sat Jun 18 2022 04:14:38 GMT+0000 (UTC)

arXiv
