CG-ATTACK: Modeling the Conditional Distribution of Adversarial Perturbations to Boost Black-Box Attack

Yan Feng; Baoyuan Wu; Yanbo Fan; Li Liu; Zhifeng Li; Shutao Xia

Adversarial examples against deep neural networks (DNNs) have been studied extensively in recent years. Modeling the distribution of adversarial perturbations could play an important role in generating them, especially in the black-box attack setting. However, to the best of our knowledge, this distribution has rarely been studied. We therefore propose to approximate the conditional distribution of adversarial perturbations given benign examples with a conditional generative flow model (c-Glow), which has a strong ability to capture complex data distributions. Standard training of the c-Glow by maximum likelihood estimation requires a very large number of adversarial perturbations, which are time-consuming to generate. To address this problem, we propose to learn the c-Glow efficiently by minimizing the KL divergence between it and an energy-based model, which can evaluate the probability of being adversarial for any randomly sampled perturbation rather than only for adversarial perturbations. Building on a c-Glow model pretrained on surrogate models with this efficient training method, we design a novel transfer mechanism and propose a score-based black-box adversarial attack that exploits both adversarial transferability and queries to the target model. Extensive experiments demonstrate that the proposed method outperforms several state-of-the-art black-box attack methods in both attack success rate and query efficiency.
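
To make the KL-based training idea more concrete, one possible reading can be sketched as follows. The notation is an assumption and does not appear in the abstract: $x$ is a benign example, $\eta$ a perturbation, $q_\theta(\eta \mid x)$ the c-Glow density, $\mathcal{L}(\eta, x)$ some adversarial loss, $\lambda > 0$ a temperature, and $p_E(\eta \mid x)$ an energy-based model with normalizer $Z(x)$. The direction of the KL divergence shown here is also an assumption; the paper itself may use the opposite direction or a different estimator.

```latex
% Assumed energy-based model: low adversarial loss means high probability.
p_E(\eta \mid x) = \frac{\exp\bigl(-\lambda\,\mathcal{L}(\eta, x)\bigr)}{Z(x)},
\qquad
Z(x) = \int \exp\bigl(-\lambda\,\mathcal{L}(\eta, x)\bigr)\,d\eta .

% KL divergence from the flow model to the energy-based model (assumed direction).
\mathrm{KL}\bigl(q_\theta \,\|\, p_E\bigr)
  = \mathbb{E}_{\eta \sim q_\theta(\cdot \mid x)}
    \bigl[\log q_\theta(\eta \mid x) - \log p_E(\eta \mid x)\bigr]
  = \mathbb{E}_{\eta \sim q_\theta(\cdot \mid x)}
    \bigl[\log q_\theta(\eta \mid x) + \lambda\,\mathcal{L}(\eta, x)\bigr]
    + \log Z(x).
```

Under these assumptions, $\log Z(x)$ does not depend on $\theta$, so the objective can be estimated from perturbations sampled from the model itself, with the loss evaluated at each sample. This matches the abstract's point that the energy-based model scores any randomly sampled perturbation, so no precomputed set of adversarial perturbations is needed.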

updated: Wed Nov 18 2020 06:28:19 GMT+0000 (UTC)

published: Mon Jun 15 2020 16:45:27 GMT+0000 (UTC)

arXiv
