Improving Black-box Adversarial Attacks with a Transfer-based Prior

Shuyu Cheng; Yinpeng Dong; Tianyu Pang; Hang Su; Jun Zhu

We consider the black-box adversarial setting, where the adversary must generate adversarial perturbations without access to the target model's gradients. Previous methods approximate the gradient either by using the transfer gradient of a surrogate white-box model or by relying on query feedback. However, these methods often suffer from low attack success rates or poor query efficiency, since it is non-trivial to estimate the gradient in a high-dimensional space with limited information. To address these problems, we propose a prior-guided random gradient-free (P-RGF) method to improve black-box adversarial attacks, which takes advantage of a transfer-based prior and the query information simultaneously. The transfer-based prior, given by the gradient of a surrogate model, is integrated into our algorithm through an optimal coefficient derived by theoretical analysis. Extensive experiments demonstrate that our method requires far fewer queries to attack black-box models, while achieving higher success rates, than alternative state-of-the-art methods.
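The idea of combining a transfer-based prior with query feedback can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `prgf_gradient_estimate`, its parameters, and the fixed mixing weight `lam` are assumptions made for illustration. In particular, the abstract says the coefficient is chosen optimally from a theoretical analysis, but it does not give the formula, so the sketch simply takes `lam` as an input. Each query direction mixes the normalized surrogate gradient with a random direction orthogonal to it, and the loss differences along those directions are averaged into a gradient estimate.

```python
import numpy as np


def prgf_gradient_estimate(loss, x, prior, lam=0.5, q=20, sigma=1e-4, rng=None):
    """Estimate the gradient of a black-box `loss` at `x` from q loss queries.

    `prior` is the surrogate-model gradient (the transfer-based prior).
    `lam` in [0, 1] is a hypothetical fixed mixing weight; the paper derives
    an optimal value, which is not reproduced here.
    """
    rng = np.random.default_rng() if rng is None else rng
    v = prior / np.linalg.norm(prior)      # unit-norm prior direction
    base = loss(x)                         # one query at the current point
    estimate = np.zeros_like(x, dtype=float)
    for _ in range(q):
        xi = rng.standard_normal(x.shape)  # random exploration direction
        xi -= np.dot(xi, v) * v            # remove the component along the prior
        xi /= np.linalg.norm(xi)
        # v and xi are orthonormal, so u has unit norm.
        u = np.sqrt(lam) * v + np.sqrt(1.0 - lam) * xi
        # Finite-difference directional derivative along u (one query).
        estimate += (loss(x + sigma * u) - base) / sigma * u
    return estimate / q
```

With `lam=0` the sketch reduces to a plain random gradient-free estimator restricted to directions orthogonal to the prior, and with `lam=1` it queries only along the surrogate gradient; intermediate values trade off trust in the prior against exploration.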

updated: Sun Jul 26 2020 14:00:51 GMT+0000 (UTC)

published: Mon Jun 17 2019 09:40:32 GMT+0000 (UTC)

arXiv
