Improving Adversarial Robustness Using Proxy Distributions

Vikash Sehwag; Saeed Mahloujifar; Tinashe Handina; Sihui Dai; Chong Xiang; Mung Chiang; Prateek Mittal

We focus on the use of proxy distributions, i.e., approximations of the underlying distribution of the training dataset, in both understanding and improving adversarial robustness in image classification. While additional training data helps in adversarial training, curating a very large number of real-world images is challenging. In contrast, proxy distributions enable us to sample a potentially unlimited number of images and to improve adversarial robustness using these samples. We first ask the question: when does adversarial robustness benefit from incorporating additional samples from the proxy distribution in the training stage? We prove that the difference between the robustness of a classifier on the proxy distribution and on the original training dataset distribution is upper bounded by the conditional Wasserstein distance between them. Our result confirms the intuition that samples from a proxy distribution that closely approximates the training dataset distribution should be able to boost adversarial robustness. Motivated by this finding, we leverage samples from state-of-the-art generative models, which can closely approximate the training data distribution, to improve robustness. In particular, we improve robust accuracy by up to 6.1% and 5.7% in the l_∞ and l_2 threat models, respectively, and certified robust accuracy by 6.7% over baselines not using proxy distributions on the CIFAR-10 dataset. Since we can sample an unlimited number of images from a proxy distribution, it also allows us to investigate the effect of an increasing number of training samples on adversarial robustness. Here we provide the first large-scale empirical investigation of the accuracy vs. robustness trade-off and the sample complexity of adversarial training by training deep neural networks on 2K to 10M images.
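To make the bounding quantity concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes that the conditional Wasserstein distance can be read as a label-weighted average of Wasserstein-1 distances between class-conditional distributions, and it restricts itself to scalar features and equal per-class sample counts so that the distance has a closed form. The function names `wasserstein_1d` and `conditional_wasserstein_1d` and the example data are hypothetical.

```python
from collections import defaultdict


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equal-size samples on the real line, the optimal coupling matches
    sorted values, so the distance is the mean absolute difference.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes per class")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def conditional_wasserstein_1d(real, proxy):
    """Label-weighted average of per-class Wasserstein-1 distances.

    real, proxy: lists of (feature, label) pairs (assumed toy format).
    Classes are weighted by their frequency in the real sample.
    """
    real_by_label = defaultdict(list)
    proxy_by_label = defaultdict(list)
    for x, y in real:
        real_by_label[y].append(x)
    for x, y in proxy:
        proxy_by_label[y].append(x)

    total = len(real)
    distance = 0.0
    for label, xs in real_by_label.items():
        weight = len(xs) / total
        distance += weight * wasserstein_1d(xs, proxy_by_label[label])
    return distance


# Hypothetical toy data: class 0 is shifted by 0.5 in the proxy, class 1 matches.
real_samples = [(0.0, 0), (1.0, 0), (5.0, 1), (6.0, 1)]
proxy_samples = [(0.5, 0), (1.5, 0), (5.0, 1), (6.0, 1)]
cwd = conditional_wasserstein_1d(real_samples, proxy_samples)
```

In this toy setting, a proxy whose class-conditional samples sit closer to the real ones yields a smaller value, which matches the intuition stated above that a closer proxy distribution should transfer robustness better. Real image data would require a distance over images and a general optimal-transport solver, which this sketch does not attempt.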

updated: Mon Apr 19 2021 16:17:12 GMT+0000 (UTC)

published: Mon Apr 19 2021 16:17:12 GMT+0000 (UTC)

arXiv
