On the Robustness of Vision Transformers to Adversarial Examples

Kaleel Mahmood; Rigel Mahmood; Marten van Dijk

Recent advances in attention-based networks have shown that Vision Transformers can achieve state-of-the-art or near state-of-the-art results on many image classification tasks. This puts transformers in the unique position of being a promising alternative to traditional convolutional neural networks (CNNs). While CNNs have been carefully studied with respect to adversarial attacks, the same cannot be said of Vision Transformers. In this paper, we study the robustness of Vision Transformers to adversarial examples. Our analysis of transformer security is divided into three parts. First, we test the transformer under standard white-box and black-box attacks. Second, we study the transferability of adversarial examples between CNNs and transformers, and show that adversarial examples do not readily transfer between the two. Third, based on this finding, we analyze the security of a simple ensemble defense of CNNs and transformers. By creating a new attack, the self-attention blended gradient attack, we show that such an ensemble is not secure under a white-box adversary. However, under a black-box adversary, we show that an ensemble can achieve unprecedented robustness without sacrificing clean accuracy. Our analysis is done using six types of white-box attacks and two types of black-box attacks. Our study encompasses multiple Vision Transformers, Big Transfer Models and CNN architectures trained on CIFAR-10, CIFAR-100 and ImageNet.
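
The transferability study mentioned in the abstract can be illustrated with a small sketch. The abstract does not define how transferability is measured, so the metric below is an assumption: among adversarial examples that fool the source model and that the target model classifies correctly when clean, it reports the fraction that also fool the target. The function name `transfer_rate` and its parameters are hypothetical and do not come from the paper.

```python
from typing import Sequence


def transfer_rate(
    labels: Sequence[int],
    source_adv_preds: Sequence[int],
    target_clean_preds: Sequence[int],
    target_adv_preds: Sequence[int],
) -> float:
    """Fraction of source-model adversarial examples that also fool the target.

    Assumed definition (not taken from the paper): an example is eligible if
    it fools the source model and the target model classifies the clean
    version correctly; it transfers if the target misclassifies the
    adversarial version.
    """
    eligible = 0
    transferred = 0
    for y, s_adv, t_clean, t_adv in zip(
        labels, source_adv_preds, target_clean_preds, target_adv_preds
    ):
        # Skip examples that fail on the source or that the target
        # already gets wrong without any perturbation.
        if s_adv != y and t_clean == y:
            eligible += 1
            if t_adv != y:
                transferred += 1
    # With no eligible examples, report 0.0 rather than dividing by zero.
    return transferred / eligible if eligible else 0.0
```

Under this assumed definition, a low value for a CNN source and a transformer target (or the reverse) would correspond to the abstract's statement that adversarial examples do not readily transfer between the two model families.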

updated: Sat Jun 05 2021 00:31:29 GMT+0000 (UTC)

published: Wed Mar 31 2021 00:29:12 GMT+0000 (UTC)

arXiv
