Is RobustBench/AutoAttack a suitable Benchmark for Adversarial Robustness?

Peter Lorenz; Dominik Strassel; Margret Keuper; Janis Keuper

Recently, RobustBench (Croce et al. 2020) has become a widely recognized benchmark for the adversarial robustness of image classification networks. In its most commonly reported sub-task, RobustBench evaluates and ranks the adversarial robustness of neural networks trained on CIFAR10 under AutoAttack (Croce and Hein 2020b), with l-inf perturbations limited to eps = 8/255. With the currently best-performing models scoring around 60% of the baseline, it is fair to characterize this benchmark as quite challenging. Despite its general acceptance in recent literature, we aim to foster discussion about the suitability of RobustBench as a key indicator of robustness that generalizes to practical applications. Our argument against this is two-fold and supported by the extensive experiments presented in this paper. We argue that I) the alteration of data by AutoAttack with l-inf, eps = 8/255 is unrealistically strong, resulting in close to perfect detection rates of adversarial samples even by simple detection algorithms and human observers. We also show that other attack methods are much harder to detect while achieving similar success rates. II) Results on low-resolution data sets like CIFAR10 do not generalize well to higher-resolution images, as gradient-based attacks appear to become even more detectable with increasing resolution.
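To make the perturbation budget concrete, the following is a minimal sketch of what an l-inf constraint with eps = 8/255 means for a single image. It is not taken from the paper or from the RobustBench or AutoAttack code. It assumes pixel values scaled to [0, 1], and the helper names `project_linf` and `linf_distance` are hypothetical. The "image" is random data of CIFAR10 size, and Gaussian noise stands in for an attack's unconstrained update.

```python
import random

# l-inf budget quoted in the abstract, assuming pixel values scaled to [0, 1].
EPS = 8 / 255


def project_linf(x_adv, x_clean, eps=EPS):
    """Clamp each pixel of x_adv to within eps of x_clean, then to [0, 1]."""
    projected = []
    for a, c in zip(x_adv, x_clean):
        a = min(max(a, c - eps), c + eps)        # stay inside the l-inf ball
        projected.append(min(max(a, 0.0), 1.0))  # stay a valid pixel value
    return projected


def linf_distance(x_adv, x_clean):
    """Largest absolute per-pixel difference between the two images."""
    return max(abs(a - c) for a, c in zip(x_adv, x_clean))


random.seed(0)
# Random stand-in for a flattened 3 x 32 x 32 image (hypothetical data).
x_clean = [random.random() for _ in range(3 * 32 * 32)]
# Unconstrained noisy copy, standing in for an attack step before projection.
x_noisy = [c + random.gauss(0.0, 0.1) for c in x_clean]
# After projection, no pixel differs from the original by more than EPS.
x_adv = project_linf(x_noisy, x_clean)
```

On an 8-bit scale, this budget lets every pixel in every channel move by up to 8 of 255 intensity levels at once.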

updated: Thu Dec 02 2021 20:44:16 GMT+0000 (UTC)

published: Thu Dec 02 2021 20:44:16 GMT+0000 (UTC)

arXiv
