A Bayes-Optimal View on Adversarial Examples

Eitan Richardson; Yair Weiss

Since the discovery of adversarial examples (the ability to fool modern CNN classifiers with tiny perturbations of the input), there has been much discussion of whether they are a "bug" specific to current neural architectures and training methods or an inevitable "feature" of high-dimensional geometry. In this paper, we argue for examining adversarial examples from the perspective of Bayes-Optimal classification. We construct realistic image datasets for which the Bayes-Optimal classifier can be efficiently computed, and we derive analytic conditions on the distributions under which these classifiers are provably robust against any adversarial attack, even in high dimensions. Our results show that even when these "gold standard" optimal classifiers are robust, CNNs trained on the same datasets consistently learn a vulnerable classifier, indicating that adversarial examples are often an avoidable "bug". We further show that RBF SVMs trained on the same data consistently learn a robust classifier. The same trend is observed in experiments with real images in different datasets.
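To make the term "Bayes-Optimal classifier" concrete, here is a minimal toy sketch in Python. It is not the paper's construction: it assumes two isotropic Gaussian classes with known means and a shared standard deviation, a setting in which the optimal rule and the smallest L2 perturbation that reaches its decision boundary both have closed forms. The function names (`bayes_optimal_predict`, `min_adversarial_l2`) and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def bayes_optimal_predict(x, mu0, mu1, sigma, prior1=0.5):
    """Return 1 if class 1 has the higher posterior under the toy model, else 0."""
    # Log-likelihoods of isotropic Gaussians; shared constants cancel in the difference.
    ll0 = -np.sum((x - mu0) ** 2, axis=-1) / (2 * sigma ** 2)
    ll1 = -np.sum((x - mu1) ** 2, axis=-1) / (2 * sigma ** 2)
    log_odds = ll1 - ll0 + np.log(prior1) - np.log(1 - prior1)
    return (log_odds > 0).astype(int)


def min_adversarial_l2(x, mu0, mu1, sigma, prior1=0.5):
    """L2 distance from x to the Bayes-optimal decision boundary (toy model only)."""
    # With equal covariances the boundary is the hyperplane w.x + b = 0.
    w = (mu1 - mu0) / sigma ** 2
    b = (mu0 @ mu0 - mu1 @ mu1) / (2 * sigma ** 2) + np.log(prior1) - np.log(1 - prior1)
    return np.abs(x @ w + b) / np.linalg.norm(w)


# Assumed toy setup: 100 dimensions, class means at -1 and +1 in every coordinate.
d = 100
mu0 = -np.ones(d)
mu1 = np.ones(d)
sigma = 1.0

label_at_mu1 = bayes_optimal_predict(mu1, mu0, mu1, sigma)
dist_at_mu1 = min_adversarial_l2(mu1, mu0, mu1, sigma)
print(label_at_mu1, dist_at_mu1)
```

In this toy setup the distance from a class mean to the boundary is half the distance between the means, so it grows with the dimension rather than shrinking. That is only meant to illustrate the kind of question the abstract raises about when an optimal classifier is robust; the paper's own analytic conditions and datasets are not reproduced here.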

updated: Wed Mar 17 2021 09:47:10 GMT+0000 (UTC)

published: Thu Feb 20 2020 16:43:47 GMT+0000 (UTC)

arXiv
