On the Properties of Adversarially-Trained CNNs

Mattia Carletti; Matteo Terzi; Gian Antonio Susto

Adversarial Training has proved to be an effective training paradigm to enforce robustness against adversarial examples in modern neural network architectures. Despite many efforts, explanations of the foundational principles underpinning the effectiveness of Adversarial Training are limited and far from being widely accepted by the Deep Learning community. In this paper, we describe surprising properties of adversarially-trained models, shedding light on mechanisms through which robustness against adversarial attacks is implemented. Moreover, we highlight limitations and failure modes affecting these models that were not discussed by prior works. We conduct extensive analyses on a wide range of architectures and datasets, performing a deep comparison between robust and natural models.

updated: Thu Mar 17 2022 11:11:52 GMT+0000 (UTC)

published: Thu Mar 17 2022 11:11:52 GMT+0000 (UTC)

arXiv
