Sparta: Spatially Attentive and Adversarially Robust Activation

Qing Guo; Felix Juefei-Xu; Changqing Zhou; Wei Feng; Yang Liu; Song Wang

Adversarial training (AT) is one of the most effective ways to improve the robustness of deep convolutional neural networks (CNNs). As with standard network training, the effectiveness of AT relies on the design of basic network components. In this paper, we conduct an in-depth study of the role of the basic ReLU activation component in AT for robust CNNs. We find that the spatially-shared and input-independent properties of ReLU activation make CNNs less robust to white-box adversarial attacks, whether they are trained with standard or adversarial training. To address this problem, we extend ReLU to a novel Sparta activation function (Spatially attentive and Adversarially Robust Activation), which enables CNNs to achieve both higher robustness, i.e., lower error rate on adversarial examples, and higher accuracy, i.e., lower error rate on clean examples, than the existing state-of-the-art (SOTA) activation functions. We further study the relationship between Sparta and the SOTA activation functions, providing more insight into the advantages of our method. With comprehensive experiments, we also find that the proposed method exhibits superior cross-CNN and cross-dataset transferability. For the former, the adversarially trained Sparta function for one CNN (e.g., ResNet-18) can be fixed and directly used to train another adversarially robust CNN (e.g., ResNet-34). For the latter, the Sparta function trained on one dataset (e.g., CIFAR-10) can be employed to train adversarially robust CNNs on another dataset (e.g., SVHN). In both cases, Sparta leads to CNNs with higher robustness than the vanilla ReLU, verifying the flexibility and versatility of the proposed method.
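To make the "spatially-shared and input-independent" property concrete, the following minimal NumPy sketch contrasts plain ReLU with a hypothetical input-dependent variant. The abstract does not give the Sparta formula, so the function `input_dependent_activation`, its per-location threshold map, and the parameters `w` and `b` are illustrative assumptions only, not the authors' definition.

```python
import numpy as np


def relu(x):
    # ReLU compares every element with the same constant threshold (0),
    # regardless of spatial location or of the input content.
    return np.maximum(x, 0.0)


def input_dependent_activation(x, w, b):
    """Hypothetical activation for a (C, H, W) feature map (NOT the paper's formula).

    The threshold is no longer a shared constant: it is a map computed from
    the input itself, so it can differ at every spatial location.
    w: (C, C) channel-mixing weights (assumed), b: (C,) bias (assumed).
    """
    # A 1x1 convolution is a channel mixing applied at each (h, w) position.
    threshold = np.einsum("oc,chw->ohw", w, x) + b[:, None, None]
    # With w = 0 and b = 0 the threshold is 0 everywhere and this is ReLU.
    return np.maximum(x, threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3, 4, 4))
    zeros_w, zeros_b = np.zeros((3, 3)), np.zeros(3)
    print(np.allclose(input_dependent_activation(x, zeros_w, zeros_b), relu(x)))
```

In this sketch, setting the assumed parameters to zero recovers ReLU exactly, which illustrates what "extending" ReLU could mean; how Sparta actually computes its spatial attention is described in the paper itself.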

updated: Sat Dec 03 2022 11:56:46 GMT+0000 (UTC)

published: Tue May 18 2021 04:36:46 GMT+0000 (UTC)

arXiv
