Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Tejas Gokhale; Rushil Anirudh; Bhavya Kailkhura; Jayaraman J. Thiagarajan; Chitta Baral; Yezhou Yang

Existing work in robust deep learning has focused on small, pixel-level, norm-based perturbations, which may not account for the perturbations encountered in several real-world settings. In many such cases, although test data may not be available, broad specifications of the types of perturbation (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. with the training domain but deviates from it. While this deviation may not be exactly known, its broad characterization is specified a priori in terms of attributes. We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space, without access to data from the test domain. Our adversarial training solves a min-max optimization problem: the inner maximization generates adversarial perturbations, and the outer minimization finds model parameters by minimizing the loss on the adversarial perturbations produced by the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations: object-related shifts, geometric transformations, and common image corruptions. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations. We show its usefulness through the robustness gains of deep neural networks trained with our adversarial training on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.
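The min-max structure described in the abstract can be illustrated with a toy sketch. This is not the authors' method: the paper learns to generate samples in attribute space, whereas the sketch below swaps in a plain grid search over a single attribute (a rotation angle) and a two-parameter logistic classifier on 2-D points. All names (`rotate`, `worst_case_angle`, `adversarial_train`, `robust_loss`, `DATA`, `ANGLES`) and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def rotate(x, angle):
    """Rotate a 2-D point by `angle` radians (the attribute being perturbed)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def logistic_loss(w, x, y):
    """Numerically stable log(1 + exp(-y * <w, x>)) for a label y in {-1, +1}."""
    z = y * (w[0] * x[0] + w[1] * x[1])
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))

def logistic_grad(w, x, y):
    """Gradient of logistic_loss with respect to w."""
    z = y * (w[0] * x[0] + w[1] * x[1])
    if z >= 0:
        e = math.exp(-z)
        s = e / (1.0 + e)            # sigmoid(-z), stable for z >= 0
    else:
        s = 1.0 / (1.0 + math.exp(z))  # sigmoid(-z), stable for z < 0
    return (-y * s * x[0], -y * s * x[1])

def worst_case_angle(w, x, y, angles):
    """Inner maximization: the allowed angle that gives the highest loss."""
    return max(angles, key=lambda a: logistic_loss(w, rotate(x, a), y))

def adversarial_train(data, angles, epochs=50, lr=0.1):
    """Outer minimization: gradient steps on the worst-case rotated samples."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            a = worst_case_angle(w, x, y, angles)
            g = logistic_grad(w, rotate(x, a), y)
            w[0] -= lr * g[0]
            w[1] -= lr * g[1]
    return w

def robust_loss(w, data, angles):
    """Average over samples of the worst-case loss over the attribute range."""
    total = 0.0
    for x, y in data:
        total += max(logistic_loss(w, rotate(x, a), y) for a in angles)
    return total / len(data)

# Hypothetical toy data: two classes separated along the first coordinate.
DATA = [((2.0, 0.5), 1), ((1.5, -0.3), 1), ((-2.0, 0.2), -1), ((-1.6, -0.4), -1)]
# Attribute specification known a priori: rotations between -45 and +45 degrees.
ANGLES = [math.radians(d) for d in range(-45, 46, 15)]

if __name__ == "__main__":
    w = adversarial_train(DATA, ANGLES)
    print("weights:", w)
    print("worst-case loss:", robust_loss(w, DATA, ANGLES))
```

The only information about the test domain used here is the attribute range in `ANGLES`; no rotated test samples are seen during training, which mirrors the setup the abstract describes. A learned generator, as proposed in the paper, would replace the exhaustive search in `worst_case_angle`.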

updated: Thu Apr 08 2021 03:25:14 GMT+0000 (UTC)

published: Thu Dec 03 2020 10:17:30 GMT+0000 (UTC)

arXiv
