Enhancing Adversarial Defense by k-Winners-Take-All

Chang Xiao; Peilin Zhong; Changxi Zheng

We propose a simple change to existing neural network structures to better defend against gradient-based adversarial attacks. Instead of using popular activation functions (such as ReLU), we advocate the use of k-Winners-Take-All (k-WTA) activation, a C^0 discontinuous function that purposely invalidates the neural network model's gradient at densely distributed input data points. The proposed k-WTA activation can be readily used in nearly all existing networks and training methods with no significant overhead. Our proposal is theoretically rationalized: we analyze why the discontinuities in k-WTA networks can largely prevent gradient-based search of adversarial examples and why, at the same time, they remain innocuous to network training. This understanding is also empirically backed. We test k-WTA activation on various network structures, trained with or without adversarial training. In all cases, k-WTA networks are more robust than traditional networks under white-box attacks.
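The abstract does not spell out the activation itself, so the following is only a minimal sketch under the usual reading of k-WTA: within one layer's output, keep the k largest values and set every other entry to zero. The function name `k_wta`, the plain-list interface, and the tie-breaking rule are assumptions made for illustration and are not taken from the paper.

```python
def k_wta(values, k):
    """Keep the k largest entries of `values`; set every other entry to 0.0.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if not 0 < k <= len(values):
        raise ValueError("k must satisfy 0 < k <= len(values)")
    # Rank indices by value, largest first. Python's sort is stable, so ties
    # go to the earlier index (an arbitrary choice made for this sketch).
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    winners = set(ranked[:k])
    return [v if i in winners else 0.0 for i, v in enumerate(values)]


# A small change in the input flips which unit wins, so the output jumps.
before = k_wta([1.0, 0.99, -2.0], k=1)
after = k_wta([1.0, 1.01, -2.0], k=1)
print(before)
print(after)
```

In the two calls above, nudging the second input from 0.99 to 1.01 changes the winner, and the output moves from one coordinate axis to another rather than shifting by a small amount. This kind of jump is the discontinuity the abstract refers to when it says the activation invalidates the model's gradient at densely distributed input points.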

updated: Tue Oct 29 2019 00:27:18 GMT+0000 (UTC)

published: Sat May 25 2019 03:36:40 GMT+0000 (UTC)

arXiv
