Understanding Robust Learning through the Lens of Representation Similarities

Christian Cianfarani; Arjun Nitin Bhagoji; Vikash Sehwag; Ben Zhao; Prateek Mittal

Representation learning, i.e., the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs). Recently, robustness to adversarial examples has emerged as a desirable property for DNNs, spurring the development of robust training methods that account for adversarial examples. In this paper, we aim to understand how the properties of representations learned by robust training differ from those obtained from standard, non-robust training. This is critical to diagnosing numerous salient pitfalls in robust networks, such as degraded performance on benign inputs, poor generalization of robustness, and increased overfitting. We use a powerful set of tools known as representation similarity metrics, across three vision datasets, to obtain layer-wise comparisons between robust and non-robust DNNs with different architectures, training procedures, and adversarial constraints. Our experiments highlight hitherto unseen properties of robust representations that we posit underlie the behavioral differences of robust networks. We discover a lack of specialization in robust networks' representations, along with a disappearance of 'block structure'. We also find that overfitting during robust training largely impacts deeper layers. These, along with other findings, suggest ways forward for the design and training of better robust networks.
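The abstract does not say which representation similarity metric is used. As an illustration only, the sketch below assumes linear centered kernel alignment (CKA), one widely used metric of this kind, and shows how a layer-wise comparison grid between two networks could be computed. The function names (`center_columns`, `linear_cka`, `layerwise_cka`) are hypothetical, and the activations are assumed to be plain lists of rows (one row per input example, one column per feature); this is not the authors' implementation.

```python
import math


def center_columns(acts):
    """Subtract each feature's mean, taken over the examples (rows)."""
    n = len(acts)
    means = [sum(col) / n for col in zip(*acts)]
    return [[v - m for v, m in zip(row, means)] for row in acts]


def cross_frobenius_sq(a, b):
    """Squared Frobenius norm of a^T b, for a (n x p) and b (n x q)."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(acts_x, acts_y):
    """Linear CKA between two activation matrices over the same examples.

    Assumes both matrices have the same number of rows and that neither
    is constant across examples (otherwise the denominator is zero).
    """
    x = center_columns(acts_x)
    y = center_columns(acts_y)
    numerator = cross_frobenius_sq(x, y)
    denominator = math.sqrt(cross_frobenius_sq(x, x) * cross_frobenius_sq(y, y))
    return numerator / denominator


def layerwise_cka(layers_a, layers_b):
    """Grid of scores: entry [i][j] compares layer i of A with layer j of B."""
    return [[linear_cka(la, lb) for lb in layers_b] for la in layers_a]
```

Under these assumptions, passing the per-layer activations of a robust and a non-robust network (collected on the same inputs) to `layerwise_cka` yields a grid of the kind in which a 'block structure' would show up as a contiguous group of layers with high mutual similarity.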

updated: Mon Jun 20 2022 16:06:20 GMT+0000 (UTC)

published: Mon Jun 20 2022 16:06:20 GMT+0000 (UTC)

arXiv
