Adversarial Attack Generation Empowered by Min-Max Optimization

Jingkang Wang; Tianyun Zhang; Sijia Liu; Pin-Yu Chen; Jiacen Xu; Makan Fardad; Bo Li

The worst-case training principle that minimizes the maximal adversarial loss, also known as adversarial training (AT), has been shown to be a state-of-the-art approach for enhancing adversarial robustness. Nevertheless, min-max optimization beyond the purpose of AT has not been rigorously explored in the adversarial context. In this paper, we show how a general framework of min-max optimization over multiple domains can be leveraged to advance the design of different types of adversarial attacks. In particular, given a set of risk sources, minimizing the worst-case attack loss can be reformulated as a min-max problem by introducing domain weights that are maximized over the probability simplex of the domain set. We showcase this unified framework in three attack generation problems: attacking model ensembles, devising universal perturbations under multiple inputs, and crafting attacks resilient to data transformations. Extensive experiments demonstrate that our approach leads to substantial attack improvement over existing heuristic strategies, as well as robustness improvement over state-of-the-art defense methods trained to be robust against multiple perturbation types. Furthermore, we find that the self-adjusted domain weights learned from our min-max framework can provide a holistic tool to explain the difficulty level of attack across domains. Code is available at https://github.com/wangjksjtu/minmax-adv.
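As a rough illustration of the reformulation described in the abstract, the worst-case objective min over delta of max over i of F_i(delta) can be written as min over delta of max over w of sum_i w_i F_i(delta), with w restricted to the probability simplex. The sketch below is not the authors' implementation (see the repository linked above for that). It assumes toy quadratic losses F_i(delta) = ||delta - t_i||^2 in place of real per-domain attack losses, an L-infinity box of radius eps for the perturbation, and plain alternating projected gradient steps; the names project_simplex and minmax_attack, and all step sizes, are hypothetical choices made for this example.

```python
import numpy as np


def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1.0) / ks > 0)[0][-1]
    tau = (css[rho] - 1.0) / (rho + 1)        # shift that makes the result sum to 1
    return np.maximum(v - tau, 0.0)


def minmax_attack(targets, steps=500, lr_delta=0.05, lr_w=0.1, eps=1.0):
    """Alternating projected gradient descent (delta) and ascent (w).

    Assumption: each domain i has the toy loss F_i(delta) = ||delta - t_i||^2,
    where t_i is row i of `targets`. Real attack losses would replace this.
    """
    num_domains, dim = targets.shape
    delta = np.zeros(dim)                          # shared perturbation
    w = np.full(num_domains, 1.0 / num_domains)    # domain weights, start uniform
    for _ in range(steps):
        # Descent step on delta for the weighted loss sum_i w_i * F_i(delta),
        # followed by projection onto the box [-eps, eps]^dim.
        grad = 2.0 * (w[:, None] * (delta - targets)).sum(axis=0)
        delta = np.clip(delta - lr_delta * grad, -eps, eps)
        # Ascent step on w: the gradient w.r.t. w_i is F_i(delta),
        # followed by projection back onto the simplex.
        losses = ((delta - targets) ** 2).sum(axis=1)
        w = project_simplex(w + lr_w * losses)
    return delta, w


if __name__ == "__main__":
    toy_targets = np.array([[0.0, 0.0], [1.0, 0.0]])
    delta, w = minmax_attack(toy_targets)
    print("perturbation:", delta)
    print("domain weights:", w)
```

In this toy setting the weights shift toward whichever domain currently has the larger loss, which is the mechanism the abstract refers to when it says the learned weights can indicate how difficult each domain is to attack.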

updated: Mon Nov 01 2021 17:21:53 GMT+0000 (UTC)

published: Sun Jun 09 2019 04:32:13 GMT+0000 (UTC)

arXiv
