Intriguing properties of adversarial training at scale

Cihang Xie; Alan Yuille

Adversarial training is one of the main defenses against adversarial attacks. In this paper, we provide the first rigorous study on diagnosing elements of adversarial training, which reveals two intriguing properties. First, we study the role of normalization. Batch normalization (BN) is a crucial element for achieving state-of-the-art performance on many vision tasks, but we show that it may prevent networks from obtaining strong robustness in adversarial training. One unexpected observation is that, for models trained with BN, simply removing clean images from the training data substantially boosts adversarial robustness, by 18.3%. We relate this phenomenon to the hypothesis that clean images and adversarial images are drawn from two different domains. This two-domain hypothesis may explain the issue of BN when training with a mixture of clean and adversarial images, as estimating normalization statistics of this mixture distribution is challenging. Guided by this two-domain hypothesis, we show that disentangling the mixture distribution for normalization, i.e., applying separate BNs to clean and adversarial images for statistics estimation, achieves much stronger robustness. Additionally, we find that enforcing BNs to behave consistently at training and testing can further enhance robustness. Second, we study the role of network capacity. We find that our so-called "deep" networks are still shallow for the task of adversarial learning. Unlike traditional classification tasks, where accuracy is only marginally improved by adding more layers to "deep" networks (e.g., ResNet-152), adversarial training exhibits a much stronger demand for deeper networks to achieve higher adversarial robustness. This robustness improvement is substantial and consistent even when the network capacity is pushed to an unprecedented scale, i.e., ResNet-638.
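To make the idea of applying separate BNs to clean and adversarial images concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the class name `DualBatchNorm`, the `domain` argument, the shared affine parameters, and the Gaussian toy data standing in for clean and adversarial features are all hypothetical. It shows only that per-domain statistics normalize each domain to zero mean and unit variance, whereas a single BN over the mixed batch does not.

```python
import numpy as np


class DualBatchNorm:
    """Batch normalization that keeps separate statistics per input domain.

    Hypothetical illustration: one running mean/variance pair for clean
    inputs and another for adversarial inputs. Sharing gamma and beta
    across the two domains is an assumption of this sketch.
    """

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.gamma = np.ones(num_features)   # shared scale
        self.beta = np.zeros(num_features)   # shared shift
        self.running = {
            d: {"mean": np.zeros(num_features), "var": np.ones(num_features)}
            for d in ("clean", "adv")
        }

    def __call__(self, x, domain, training=True):
        stats = self.running[domain]
        if training:
            # Normalize with the batch statistics of this domain only,
            # and update that domain's running estimates.
            mean, var = x.mean(axis=0), x.var(axis=0)
            m = self.momentum
            stats["mean"] = (1 - m) * stats["mean"] + m * mean
            stats["var"] = (1 - m) * stats["var"] + m * var
        else:
            # At test time, use the running estimates of the chosen domain.
            mean, var = stats["mean"], stats["var"]
        return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta


rng = np.random.default_rng(0)
# Toy stand-ins for features from two domains (not real adversarial examples).
clean = rng.normal(loc=0.0, scale=1.0, size=(256, 4))
adv = rng.normal(loc=3.0, scale=2.0, size=(256, 4))

bn = DualBatchNorm(num_features=4)
clean_out = bn(clean, domain="clean")
adv_out = bn(adv, domain="adv")

# For comparison: a single BN over the mixed batch of both domains.
mixed = np.concatenate([clean, adv])
mixed_out = (mixed - mixed.mean(axis=0)) / np.sqrt(mixed.var(axis=0) + 1e-5)
clean_part_of_mixed = mixed_out[: len(clean)]

print("clean mean with separate BN:", clean_out.mean(axis=0))
print("clean mean with mixed BN:   ", clean_part_of_mixed.mean(axis=0))
```

With the mixed batch, the clean half ends up with a clearly non-zero mean after normalization because the statistics are estimated from both domains at once. This is a toy analogue of the estimation difficulty described in the abstract, not a reproduction of its results.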

updated: Sat Dec 21 2019 20:48:24 GMT+0000 (UTC)

published: Mon Jun 10 2019 03:41:52 GMT+0000 (UTC)

arXiv
