Diffusion Models for Adversarial Purification

Weili Nie; Brandon Guo; Yujia Huang; Chaowei Xiao; Arash Vahdat; Anima Anandkumar

Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. These methods make no assumptions about the form of the attack or the classification model, and can therefore defend pre-existing classifiers against unseen threats. However, their performance currently falls behind that of adversarial training methods. In this work, we propose DiffPure, which uses diffusion models for adversarial purification: given an adversarial example, we first diffuse it with a small amount of noise following a forward diffusion process, and then recover the clean image through a reverse generative process. To evaluate our method against strong adaptive attacks in an efficient and scalable way, we propose using the adjoint method to compute full gradients of the reverse generative process. Extensive experiments on three image datasets (CIFAR-10, ImageNet and CelebA-HQ) with three classifier architectures (ResNet, WideResNet and ViT) demonstrate that our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods, often by a large margin. Project page: https://diffpure.github.io.
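The two-stage procedure in the abstract (diffuse the input with a small amount of noise, then run the reverse generative process) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a DDPM-style variance-preserving schedule with linear betas, and it replaces the trained diffusion model with the exact score of a Gaussian data distribution so that the example runs without any learned weights. The names `make_schedule`, `gaussian_score`, `purify` and the value of `t_star` are hypothetical choices made for illustration.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_min=1e-4, beta_max=2e-2):
    """Linear beta schedule (an assumption) and cumulative alpha products."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def gaussian_score(x, alpha_bar, mu, sigma):
    """Exact score of the diffused marginal when clean data ~ N(mu, sigma^2 I).

    Stand-in for a trained score network; only valid for this toy prior.
    """
    var = alpha_bar * sigma**2 + (1.0 - alpha_bar)
    return -(x - np.sqrt(alpha_bar) * mu) / var

def purify(x_adv, t_star, betas, alpha_bars, score_fn, rng):
    """Diffuse x_adv to step t_star, then denoise back to step 0."""
    # Forward process: jump to step t_star in closed form.
    ab = alpha_bars[t_star - 1]
    x = np.sqrt(ab) * x_adv + np.sqrt(1.0 - ab) * rng.standard_normal(x_adv.shape)
    # Reverse process: ancestral sampling from t_star down to 1.
    for t in range(t_star, 0, -1):
        beta = betas[t - 1]
        score = score_fn(x, alpha_bars[t - 1])
        mean = (x + beta * score) / np.sqrt(1.0 - beta)
        # No noise is added on the final step.
        noise = rng.standard_normal(x.shape) if t > 1 else np.zeros_like(x)
        x = mean + np.sqrt(beta) * noise
    return x

# Toy usage: a 64-dimensional "image" drawn from the Gaussian prior.
rng = np.random.default_rng(0)
mu, sigma = np.full(64, 0.5), 0.1
betas, alpha_bars = make_schedule()
x_clean = mu + sigma * rng.standard_normal(64)
# Sign-pattern perturbation standing in for an adversarial attack.
x_adv = x_clean + 0.3 * np.sign(rng.standard_normal(64))
x_pur = purify(x_adv, 100, betas, alpha_bars,
               lambda x, ab: gaussian_score(x, ab, mu, sigma), rng)
print("distance before:", np.linalg.norm(x_adv - x_clean))
print("distance after: ", np.linalg.norm(x_pur - x_clean))
```

The choice of `t_star` reflects the trade-off implied by "a small amount of noise": enough noise to wash out the perturbation, but not so much that the reverse process loses the content of the original input. In a real setting the score function would come from a pretrained diffusion model and the purified output would be passed to the classifier.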

updated: Mon May 16 2022 06:03:00 GMT+0000 (UTC)

published: Mon May 16 2022 06:03:00 GMT+0000 (UTC)

arXiv
