Learnable Boundary Guided Adversarial Training

Jiequan Cui; Shu Liu; Liwei Wang; Jiaya Jia

Previous adversarial training methods improve model robustness at the cost of accuracy on natural data. In this paper, we reduce this degradation in natural accuracy. We use the logits of a clean model to guide the learning of a separate robust model, on the grounds that logits from a well-trained clean model embed the most discriminative features of natural data, e.g., a generalizable classifier boundary. Our solution is to constrain the logits of the robust model, which takes adversarial examples as input, to be similar to those of the clean model fed with the corresponding natural data. This lets the robust model inherit the classifier boundary of the clean model. Moreover, we observe that such boundary guidance not only preserves high natural accuracy but also benefits model robustness, which gives new insights and facilitates progress for the adversarial community. Finally, extensive experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet testify to the effectiveness of our method. On the auto-attack benchmark (https://github.com/fra31/auto-attack), we achieve new state-of-the-art robustness on CIFAR-100 without additional real or synthetic data. Our code is available at https://github.com/dvlab-research/LBGAT.
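The logit constraint described in the abstract can be illustrated with a small sketch. The abstract only says the robust model's logits on adversarial examples should be "similar" to the clean model's logits on the matching natural data; the mean squared error used here is an assumption of this sketch, as are the function name `boundary_guidance_loss` and the example values. It is not taken from the authors' released code.

```python
from typing import Sequence


def boundary_guidance_loss(
    robust_logits_adv: Sequence[Sequence[float]],
    clean_logits_nat: Sequence[Sequence[float]],
) -> float:
    """Mean squared difference between two batches of logits.

    robust_logits_adv: logits of the robust model on adversarial examples.
    clean_logits_nat:  logits of the clean model on the matching natural inputs.
    The squared-error distance is an assumption of this sketch.
    """
    if len(robust_logits_adv) != len(clean_logits_nat):
        raise ValueError("batches must have the same number of examples")
    total, count = 0.0, 0
    for robust_row, clean_row in zip(robust_logits_adv, clean_logits_nat):
        if len(robust_row) != len(clean_row):
            raise ValueError("logit vectors must have the same number of classes")
        for r, c in zip(robust_row, clean_row):
            total += (r - c) ** 2
            count += 1
    if count == 0:
        raise ValueError("empty batch")
    return total / count


# Hypothetical logits for a batch of two examples and three classes.
clean = [[2.0, 0.0, -1.0], [0.5, 1.5, 0.0]]
robust = [[1.0, 0.0, -1.0], [0.5, 1.5, 2.0]]
loss = boundary_guidance_loss(robust, clean)
```

In a training loop, a term of this kind would be minimized with respect to the robust model's parameters, so that its outputs on perturbed inputs move toward the clean model's outputs on the unperturbed inputs. How the clean model itself is trained and how the adversarial examples are generated are not specified in the abstract and are left out here.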

updated: Mon Aug 16 2021 04:40:26 GMT+0000 (UTC)

published: Mon Nov 23 2020 01:36:05 GMT+0000 (UTC)

arXiv
