Solving Inefficiency of Self-supervised Representation Learning

Guangrun Wang; Keze Wang; Guangcong Wang; Philip H. S. Torr; Liang Lin

Self-supervised learning (especially contrastive learning) has attracted great interest because of its potential to learn discriminative representations without labels. Despite its acknowledged successes, existing contrastive learning methods suffer from very low learning efficiency, e.g., they take about ten times more training epochs than supervised learning to reach comparable recognition accuracy. In this paper, we reveal two contradictory phenomena in contrastive learning, which we call the under-clustering and over-clustering problems, that are major obstacles to learning efficiency. Under-clustering means that the model cannot efficiently learn to discover the dissimilarity between inter-class samples when the negative sample pairs for contrastive learning are insufficient to differentiate all the actual object classes. Over-clustering means that the model cannot efficiently learn features from excessive negative sample pairs, which force it to split samples of the same actual class into different clusters. To overcome both problems simultaneously, we propose a novel self-supervised learning framework using a truncated triplet loss. Specifically, to address the under-clustering problem, we employ a triplet loss that tends to maximize the relative distance between the positive pair and the negative pairs; to avoid the over-clustering problem, we construct the negative pair by selecting a negative sample deputy from all negative samples, which is guaranteed by a Bernoulli distribution model. We extensively evaluate our framework on several large-scale benchmarks (e.g., ImageNet, SYSU-30k, and COCO). The results demonstrate our model's superiority (e.g., in learning efficiency) over the latest state-of-the-art methods by a clear margin. Code is available at: https://github.com/wanggrun/triplet
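
As an illustration only, the following Python sketch shows one way a triplet loss with a single negative deputy could look. The abstract does not say how the deputy is chosen, so everything specific here is an assumption rather than the released implementation: the function name `truncated_triplet_loss`, the use of cosine distance, the selection rule (take the k-th nearest negative to the anchor instead of the nearest one), and the values of `k` and `margin`.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def truncated_triplet_loss(anchor, positive, negatives, k=2, margin=0.5):
    """Hinge triplet loss computed against one negative "deputy".

    anchor, positive: arrays of shape (d,)
    negatives:        array of shape (n, d)
    k:      1-based rank of the deputy among the negatives, sorted from
            nearest to farthest from the anchor (assumed selection rule)
    margin: hinge margin (assumed value)
    Returns (loss, index of the chosen deputy).
    """
    a = l2_normalize(np.asarray(anchor, dtype=float))
    p = l2_normalize(np.asarray(positive, dtype=float))
    n = l2_normalize(np.asarray(negatives, dtype=float))

    d_pos = 1.0 - a @ p   # cosine distance anchor-positive, scalar
    d_neg = 1.0 - n @ a   # cosine distance anchor-negatives, shape (n,)

    if not 1 <= k <= d_neg.shape[0]:
        raise ValueError("k must lie between 1 and the number of negatives")

    # Skip the k-1 nearest negatives, which are the most likely to be
    # same-class samples mislabeled as negatives, and use the k-th one.
    deputy_idx = int(np.argsort(d_neg)[k - 1])

    # Standard triplet hinge: push the deputy at least `margin` farther
    # from the anchor than the positive is.
    loss = max(0.0, float(d_pos - d_neg[deputy_idx] + margin))
    return loss, deputy_idx


# Toy example: the first "negative" is identical to the anchor, standing in
# for a same-class sample that was wrongly treated as a negative.
anchor = np.array([1.0, 0.0])
positive = np.array([1.0, 0.0])
negatives = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

loss_hardest, idx_hardest = truncated_triplet_loss(anchor, positive, negatives, k=1)
loss_deputy, idx_deputy = truncated_triplet_loss(anchor, positive, negatives, k=2)
```

In this toy case, `k=1` selects the false negative and yields a loss of 0.5, which would push a same-class sample away from the anchor. With `k=2` the deputy is the second-nearest negative and the loss is 0. This is the kind of over-clustering pressure that selecting a deputy is meant to avoid, under the assumptions stated above.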

updated: Thu Oct 21 2021 10:19:10 GMT+0000 (UTC)

published: Sun Apr 18 2021 07:47:10 GMT+0000 (UTC)

arXiv
