Identifying Layers Susceptible to Adversarial Attacks

Shoaib Ahmed Siddiqui; Thomas Breuel

In this paper, we investigate the use of pretraining with adversarial networks, with the objective of discovering the relationship between network depth and robustness. For this purpose, we selectively retrain different portions of VGG and ResNet architectures on CIFAR-10, Imagenette, and ImageNet using non-adversarial and adversarial data. Experimental results show that susceptibility to adversarial samples is associated with low-level feature extraction layers. Therefore, retraining only the high-level layers is insufficient for achieving robustness. Furthermore, adversarial attacks yield outputs from early layers that differ statistically from the features of non-adversarial samples and do not permit consistent classification by subsequent layers. This supports common hypotheses regarding the association of robustness with the feature extractor, the insufficiency of deeper layers in providing robustness, and large differences between adversarial and non-adversarial feature vectors.
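The idea of selectively retraining one portion of a network while the rest stays frozen can be illustrated with a toy sketch. This is not the authors' code or setup: the two-weight network, the squared-error loss, the sign-gradient (FGSM-style) perturbation, and all names (`forward`, `gradients`, `fgsm`, `retrain`) and hyperparameters are assumptions chosen only to keep the example self-contained.

```python
import math

def forward(params, x):
    """Toy two-layer network: h = tanh(w1 * x) ("early" layer), out = w2 * h ("late" layer)."""
    h = math.tanh(params["w1"] * x)
    return h, params["w2"] * h

def gradients(params, x, y):
    """Analytic gradients of the loss 0.5 * (out - y)^2 w.r.t. w1, w2, and the input x."""
    h, out = forward(params, x)
    err = out - y
    dpre = err * params["w2"] * (1.0 - h * h)  # backprop through tanh
    return {"w1": dpre * x, "w2": err * h, "x": dpre * params["w1"]}

def fgsm(params, x, y, eps):
    """One-step sign-gradient perturbation of the input (assumed attack, for illustration)."""
    g = gradients(params, x, y)["x"]
    sign = (g > 0) - (g < 0)
    return x + eps * sign

def retrain(params, data, trainable, eps=0.0, lr=0.1, epochs=50):
    """Update only the layers named in `trainable`; all other layers stay frozen.

    eps > 0 retrains on perturbed inputs, eps == 0 on the clean inputs.
    """
    params = dict(params)  # copy so the pretrained weights are not modified
    for _ in range(epochs):
        for x, y in data:
            x_in = fgsm(params, x, y, eps) if eps > 0 else x
            g = gradients(params, x_in, y)
            for name in trainable:
                params[name] -= lr * g[name]
    return params

# Hypothetical "pretrained" weights and a tiny made-up dataset.
pretrained = {"w1": 0.5, "w2": 1.0}
data = [(1.0, 0.8), (-1.0, -0.8), (0.5, 0.4)]

# Retrain only the late layer, or only the early layer, on perturbed inputs.
head_only = retrain(pretrained, data, trainable={"w2"}, eps=0.1)
early_only = retrain(pretrained, data, trainable={"w1"}, eps=0.1)
```

In a setup like this, `head_only` and `early_only` could each be evaluated on clean and perturbed inputs to compare which portion of the network matters for robustness. The toy model is far too small to reproduce the paper's findings and only shows the mechanics of freezing.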

updated: Fri Oct 29 2021 00:26:34 GMT+0000 (UTC)

published: Sat Jul 10 2021 12:38:49 GMT+0000 (UTC)

arXiv
