PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier

Chong Xiang; Saeed Mahloujifar; Prateek Mittal

The adversarial patch attack against image classification models aims to inject adversarially crafted pixels within a restricted image region (i.e., a patch) to induce model misclassification. This attack can be realized in the physical world by printing the patch and attaching it to the victim object; thus, it poses a real-world threat to computer vision systems. To counter this threat, we design PatchCleanser as a certifiably robust defense against adversarial patches. In PatchCleanser, we perform two rounds of pixel masking on the input image to neutralize the effect of the adversarial patch. Because this operation works purely in image space, PatchCleanser is compatible with any state-of-the-art image classifier and can therefore achieve high accuracy. Furthermore, we can prove that PatchCleanser will always predict the correct class labels on certain images against any adaptive white-box attacker within our threat model, achieving certified robustness. We extensively evaluate PatchCleanser on the ImageNet, ImageNette, CIFAR-10, CIFAR-100, SVHN, and Flowers-102 datasets and demonstrate that our defense achieves clean accuracy similar to that of state-of-the-art classification models while significantly improving certified robustness over prior work. Remarkably, PatchCleanser achieves 83.9% top-1 clean accuracy and 62.1% top-1 certified robust accuracy against a 2%-pixel square patch anywhere on the image for the 1000-class ImageNet dataset.
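The two rounds of pixel masking mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the procedure, so the logic below is an assumption based on a common reading of the double-masking idea: if all one-mask predictions agree, return that label; otherwise, trust a disagreeing label only if it survives a second round of masks. All names here (`make_masks`, `apply_mask`, `double_masking`, `toy_classify`) are hypothetical, the classifier is a toy stand-in, and the mask size is chosen so that some mask fully covers the patch wherever it is placed.

```python
from collections import Counter

def make_masks(img_size, mask_size, stride):
    """Top-left corners (row, col) of square masks that sweep the image."""
    starts = list(range(0, img_size - mask_size + 1, stride))
    if starts[-1] != img_size - mask_size:
        starts.append(img_size - mask_size)  # make sure the border is covered
    return [(r, c) for r in starts for c in starts]

def apply_mask(image, corner, mask_size, fill=0.0):
    """Return a copy of the image with one square region overwritten."""
    r0, c0 = corner
    out = [row[:] for row in image]
    for r in range(r0, r0 + mask_size):
        for c in range(c0, c0 + mask_size):
            out[r][c] = fill
    return out

def double_masking(image, masks, mask_size, classify):
    """Assumed two-round masking logic (a sketch, not the authors' code)."""
    # Round 1: classify every one-mask image.
    first = [classify(apply_mask(image, m, mask_size)) for m in masks]
    majority = Counter(first).most_common(1)[0][0]
    if all(p == majority for p in first):
        return majority  # unanimous agreement
    # Round 2: a disagreeing label is trusted only if every second mask,
    # applied on top of the first, still yields that same label.
    for m1, p1 in zip(masks, first):
        if p1 == majority:
            continue
        once = apply_mask(image, m1, mask_size)
        if all(classify(apply_mask(once, m2, mask_size)) == p1 for m2 in masks):
            return p1
    return majority

def toy_classify(image):
    """Toy stand-in classifier: any very bright pixel flips the label to 1."""
    return 1 if max(max(row) for row in image) > 5 else 0

# 8x8 clean image of class 0, and a copy with a 2x2 adversarial patch.
clean = [[1.0] * 8 for _ in range(8)]
attacked = [row[:] for row in clean]
for r in (3, 4):
    for c in (3, 4):
        attacked[r][c] = 9.0

# Mask size 3 with stride 2 covers every possible 2x2 patch position.
masks = make_masks(img_size=8, mask_size=3, stride=2)
print(toy_classify(attacked))                          # undefended prediction
print(double_masking(attacked, masks, 3, toy_classify))  # defended prediction
```

In this toy setup the undefended classifier is fooled by the patch, while the masked procedure recovers the clean label because one first-round mask removes the patch entirely and its prediction then stays stable under every second-round mask. A real deployment would replace `toy_classify` with an actual image classifier.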

updated: Fri Apr 08 2022 18:52:45 GMT+0000 (UTC)

published: Fri Aug 20 2021 12:09:33 GMT+0000 (UTC)

arXiv

