Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap

Weiyang Liu; Longhui Yu; Adrian Weller; Bernhard Schölkopf

The neural collapse (NC) phenomenon describes an underlying geometric symmetry of deep neural networks, in which both the deeply learned features and the classifiers converge to a simplex equiangular tight frame. It has been shown that both cross-entropy loss and mean squared error can provably lead to NC. We remove NC's key assumption on the feature dimension and the number of classes, and then present a generalized neural collapse (GNC) hypothesis that effectively subsumes the original NC. Inspired by how NC characterizes the training target of neural networks, we decouple GNC into two objectives: minimal intra-class variability and maximal inter-class separability. We then use hyperspherical uniformity (which characterizes the degree of uniformity on the unit hypersphere) as a unified framework to quantify these two objectives. Finally, we propose a general objective, the hyperspherical uniformity gap (HUG), defined as the difference between inter-class and intra-class hyperspherical uniformity. HUG not only provably converges to GNC, but also decouples GNC into two separate objectives. Unlike cross-entropy loss, which couples intra-class compactness and inter-class separability, HUG offers more flexibility and serves as a good alternative loss function. Empirical results show that HUG works well in terms of generalization and robustness.
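To make the "gap" idea concrete, the following is a minimal NumPy sketch of a HUG-style score, not the authors' implementation. The abstract does not say which uniformity measure is used, so this sketch assumes a crude stand-in: the mean squared pairwise distance between unit vectors. The class proxies (normalized class means), the intra-class term (mean squared distance from each feature to its proxy), and the function names `normalize`, `mean_pairwise_sq_dist`, and `hug_gap` are likewise assumptions made for illustration.

```python
import numpy as np


def normalize(x, eps=1e-12):
    """Project each row of x onto the unit hypersphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def mean_pairwise_sq_dist(points):
    """Assumed stand-in for uniformity: mean squared distance over distinct pairs.

    For unit vectors, ||a - b||^2 = 2 - 2 * <a, b>.
    """
    n = points.shape[0]
    if n < 2:
        return 0.0
    sq_dist = 2.0 - 2.0 * (points @ points.T)
    upper = np.triu_indices(n, k=1)  # each unordered pair once
    return float(sq_dist[upper].mean())


def hug_gap(features, labels):
    """Return (gap, inter, intra) where gap = inter - intra.

    inter: spread of the class proxies (larger means better separated).
    intra: spread of features around their own proxy (smaller means more compact).
    """
    labels = np.asarray(labels)
    feats = normalize(np.asarray(features, dtype=float))
    classes = np.unique(labels)

    # Hypothetical choice of proxy: the normalized mean of each class.
    proxies = normalize(
        np.stack([feats[labels == c].mean(axis=0) for c in classes])
    )

    inter = mean_pairwise_sq_dist(proxies)
    intra = float(np.mean([
        np.sum((feats[labels == c] - proxies[i]) ** 2, axis=1).mean()
        for i, c in enumerate(classes)
    ]))
    return inter - intra, inter, intra
```

Because the two terms are computed separately, they can be weighted or swapped for other measures independently, which is the kind of decoupling the abstract describes. As a sanity check on this sketch, three classes collapsed onto unit vectors 120 degrees apart in 2D give `inter = 3`, `intra = 0`, and so a gap of 3; spreading the features within each class lowers the gap. Mean pairwise distance is only a rough proxy for uniformity, so a real training objective would likely use a stronger measure.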

updated: Sat Mar 11 2023 19:33:24 GMT+0000 (UTC)

published: Sat Mar 11 2023 19:33:24 GMT+0000 (UTC)

arXiv
