CARSO: Counter-Adversarial Recall of Synthetic Observations

Emanuele Ballarin; Alessio Ansuini; Luca Bortolussi

In this paper, we propose a novel adversarial defence mechanism for image classification -- CARSO -- inspired by cues from cognitive neuroscience. The method is synergistically complementary to adversarial training and relies on knowledge of the internal representation of the attacked classifier. Exploiting a generative model for adversarial purification, conditioned on such representation, it samples reconstructions of inputs to be finally classified. Experimental evaluation by a well-established benchmark of varied, strong adaptive attacks, across diverse image datasets and classifier architectures, shows that CARSO is able to defend the classifier significantly better than state-of-the-art adversarial training alone -- with a tolerable clean accuracy toll. Furthermore, the defensive architecture succeeds in effectively shielding itself from unforeseen threats, and end-to-end attacks adapted to fool stochastic defences. Code and pre-trained models are available at https://github.com/emaballarin/CARSO .
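
The pipeline described in the abstract (read the classifier's internal representation, sample reconstructions from a generative model conditioned on it, then classify those reconstructions) can be illustrated with a small sketch. This is only a toy under stated assumptions, not the authors' implementation: the weights are random and untrained, the one-hidden-layer classifier and linear-tanh decoder are stand-ins, all function and variable names are hypothetical, and averaging class probabilities over the samples is an assumed aggregation rule that the abstract does not specify.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def classifier_features(x, W1):
    """Internal representation of the toy classifier (one ReLU layer)."""
    return np.maximum(x @ W1, 0.0)

def classifier_logits(x, W1, W2):
    """Toy classifier: hidden ReLU layer followed by a linear read-out."""
    return classifier_features(x, W1) @ W2

def sample_reconstructions(features, D, latent_dim, n_samples, rng):
    """Toy conditional decoder (assumption): maps [latent noise, features]
    to input-shaped reconstructions, one per latent sample."""
    z = rng.standard_normal((n_samples, latent_dim))
    cond = np.tile(features, (n_samples, 1))
    return np.tanh(np.concatenate([z, cond], axis=1) @ D)

def purify_and_classify(x, W1, W2, D, latent_dim, n_samples, rng):
    """Condition on the classifier's features of x, sample reconstructions,
    classify each one, and average the class probabilities (assumed rule)."""
    feats = classifier_features(x, W1)
    recons = sample_reconstructions(feats, D, latent_dim, n_samples, rng)
    probs = softmax(classifier_logits(recons, W1, W2))
    mean_probs = probs.mean(axis=0)
    return int(mean_probs.argmax()), mean_probs

# Demo with random, untrained weights: it only shows the data flow.
rng = np.random.default_rng(0)
input_dim, hidden_dim, n_classes, latent_dim = 16, 8, 3, 4
W1 = rng.standard_normal((input_dim, hidden_dim))
W2 = rng.standard_normal((hidden_dim, n_classes))
D = rng.standard_normal((latent_dim + hidden_dim, input_dim))
x = rng.standard_normal(input_dim)  # stands in for a (possibly perturbed) input

label, mean_probs = purify_and_classify(x, W1, W2, D, latent_dim, 10, rng)
print(label, mean_probs)
```

Because the decoder is stochastic, repeated calls with a fresh random generator state can return different predictions for the same input; in this sketch, averaging over several samples is what smooths that out.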

updated: Wed Jun 14 2023 00:28:09 GMT+0000 (UTC)

published: Thu May 25 2023 09:04:31 GMT+0000 (UTC)

arXiv
