Rethinking Prototypical Contrastive Learning through Alignment, Uniformity and Correlation

Shentong Mo; Zhun Sun; Chao Li

Contrastive self-supervised learning (CSL) with prototypical regularization has been introduced to learn meaningful representations for downstream tasks that require strong semantic information. However, optimizing CSL with a loss that applies the prototypical regularization aggressively, e.g., the ProtoNCE loss, may cause "coagulation" of examples in the embedding space. That is, the intra-prototype diversity of samples collapses to a trivial solution so that their prototype is well separated from the others. Motivated by previous works, we propose to mitigate this phenomenon by learning Prototypical representation through Alignment, Uniformity and Correlation (PAUC). Specifically, the ordinary ProtoNCE loss is revised with: (1) an alignment loss that pulls embeddings from positive prototypes together; (2) a uniformity loss that distributes the prototype-level features uniformly; (3) a correlation loss that increases the diversity and discriminability between prototype-level features. We conduct extensive experiments on various benchmarks, and the results demonstrate the effectiveness of our method in improving the quality of prototypical contrastive representations. In particular, in downstream classification tasks with linear probes, our proposed method outperforms the state-of-the-art instance-wise and prototypical contrastive learning methods by 2.96% on the ImageNet-100 dataset and by 2.46% on the ImageNet-1K dataset under the same batch size and epoch settings.
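The three terms named in the abstract can be illustrated with a small NumPy sketch. The abstract does not give the formulas, so everything here is an assumption rather than the PAUC definition: the alignment and uniformity terms follow the commonly used squared-distance and Gaussian-potential forms, the correlation term is a hypothetical decorrelation penalty on prototype features, and the function names, the temperature `t`, and the loss weights are invented for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project each row onto the unit hypersphere."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def alignment_loss(z, positive_prototypes):
    """Mean squared distance between each embedding and its positive prototype.

    Assumed form; lower values mean embeddings sit closer to their prototype.
    """
    return float(np.mean(np.sum((z - positive_prototypes) ** 2, axis=1)))


def uniformity_loss(prototypes, t=2.0):
    """Log of the mean Gaussian potential over distinct prototype pairs.

    Assumed form; lower values mean prototypes are spread more evenly.
    """
    diffs = prototypes[:, None, :] - prototypes[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    upper = np.triu_indices(len(prototypes), k=1)  # each pair counted once
    return float(np.log(np.mean(np.exp(-t * sq_dists[upper]))))


def correlation_loss(prototypes, eps=1e-12):
    """Mean squared off-diagonal entry of the feature correlation matrix.

    Hypothetical decorrelation penalty, not taken from the paper.
    """
    centered = prototypes - prototypes.mean(axis=0, keepdims=True)
    standardized = centered / (centered.std(axis=0, keepdims=True) + eps)
    corr = standardized.T @ standardized / len(prototypes)
    off_diag = corr - np.diag(np.diag(corr))
    d = corr.shape[0]
    return float(np.sum(off_diag ** 2) / (d * (d - 1)))


# Toy data: 8 embeddings, 3 prototypes, 4 feature dimensions.
rng = np.random.default_rng(0)
z = l2_normalize(rng.normal(size=(8, 4)))
prototypes = l2_normalize(rng.normal(size=(3, 4)))
assignments = rng.integers(0, 3, size=8)  # hypothetical cluster assignments

# Hypothetical equal weighting of the three regularizers.
total = (
    alignment_loss(z, prototypes[assignments])
    + uniformity_loss(prototypes)
    + correlation_loss(prototypes)
)
print(total)
```

In a training setup these terms would be added to the ProtoNCE objective and computed on differentiable tensors; the sketch only shows what each quantity measures.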

updated: Tue Oct 18 2022 22:33:12 GMT+0000 (UTC)

published: Tue Oct 18 2022 22:33:12 GMT+0000 (UTC)

arXiv
