Understanding Robust Learning through the Lens of Representation Similarities

Christian Cianfarani; Arjun Nitin Bhagoji; Vikash Sehwag; Ben Y. Zhao; Prateek Mittal; Haitao Zheng

Representation learning, i.e., the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs). Recently, robustness to adversarial examples has emerged as a desirable property for DNNs, spurring the development of robust training methods that account for adversarial examples. In this paper, we aim to understand how the properties of representations learned by robust training differ from those obtained from standard, non-robust training. This is critical to diagnosing numerous salient pitfalls in robust networks, such as degraded performance on benign inputs, poor generalization of robustness, and increased overfitting. We use a powerful set of tools known as representation similarity metrics, across three vision datasets, to obtain layer-wise comparisons between robust and non-robust DNNs with different training procedures, architectural parameters, and adversarial constraints. Our experiments highlight hitherto unseen properties of robust representations that we posit underlie the behavioral differences of robust networks. We discover a lack of specialization in robust networks' representations, along with a disappearance of "block structure". We also find that overfitting during robust training largely affects deeper layers. These, along with other findings, suggest ways forward for the design and training of better robust networks.
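
As a rough illustration of what a layer-wise comparison with a representation similarity metric can look like, the sketch below computes linear centered kernel alignment (CKA) between every pair of layers from two networks. The abstract does not name a specific metric, so the choice of linear CKA here is an assumption, and the function names `linear_cka` and `layerwise_cka` are hypothetical. Activations are assumed to be flattened to shape (examples, features) and collected on the same inputs for both networks.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_examples, n_features).

    The two matrices must share the same examples (rows) but may have
    different feature widths.
    """
    # Center each feature so the similarity ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


def layerwise_cka(acts_a, acts_b):
    """Similarity matrix with entry [i, j] = CKA(layer i of A, layer j of B).

    acts_a and acts_b are lists of per-layer activation matrices
    (hypothetical inputs, e.g. from a robust and a non-robust network).
    """
    return np.array([[linear_cka(a, b) for b in acts_b] for a in acts_a])


if __name__ == "__main__":
    # Synthetic stand-in activations, not data from the paper.
    rng = np.random.default_rng(0)
    net_a = [rng.normal(size=(100, 16)), rng.normal(size=(100, 32))]
    net_b = [rng.normal(size=(100, 16)), rng.normal(size=(100, 8))]
    print(layerwise_cka(net_a, net_b))
```

Plotting such a matrix as a heatmap, with one network's layers on each axis, is the usual way to look for patterns like large blocks of mutually similar layers.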

updated: Thu Sep 15 2022 17:47:42 GMT+0000 (UTC)

published: Mon Jun 20 2022 16:06:20 GMT+0000 (UTC)

arXiv
