Solving Inefficiency of Self-supervised Representation Learning

Guangrun Wang; Keze Wang; Guangcong Wang; Phillip H. S. Torr; Liang Lin

Self-supervised learning has attracted great interest due to its tremendous potential for learning discriminative representations in an unsupervised manner. Along this direction, contrastive learning achieves the current state-of-the-art performance. Despite these acknowledged successes, existing contrastive learning methods suffer from very low learning efficiency, e.g., they take about ten times more training epochs than supervised learning to reach comparable recognition accuracy. In this paper, we identify two contradictory phenomena in contrastive learning, which we call the under-clustering and over-clustering problems, that are major obstacles to learning efficiency. Under-clustering means that the model cannot efficiently learn the dissimilarity between inter-class samples when the negative sample pairs used for contrastive learning are insufficient to differentiate all the actual object categories. Over-clustering means that the model cannot efficiently learn the feature representation from excessive negative sample pairs, which include many outliers and thus force the model to split samples of the same actual category into different clusters. To overcome these two problems simultaneously, we propose a novel self-supervised learning framework using a median triplet loss. Specifically, to address the under-clustering problem, we employ a triplet loss that tends to maximize the relative distance between the positive pair and the negative pairs; to avoid the over-clustering problem, we construct the negative pair by selecting, from all negative samples, the negative sample with the median similarity score, a choice guaranteed by the Bernoulli distribution model. We extensively evaluate the proposed framework on several large-scale benchmarks (e.g., ImageNet, SYSU-30k, and COCO). The results demonstrate the superior performance of our model over the latest state-of-the-art methods by a clear margin.
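The negative-selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the hinge form with a `margin` of 0.5, the choice of the lower median when the number of negatives is even, and the function names `cosine_similarity` and `median_triplet_loss` are all assumptions made for illustration only.

```python
import math
from statistics import median_low


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def median_triplet_loss(anchor, positive, negatives, margin=0.5):
    """Hinge-style triplet loss that uses the median-similarity negative.

    Assumptions (not stated in the abstract): cosine similarity as the
    score, a hinge with `margin`, and the lower median for even counts.
    """
    pos_sim = cosine_similarity(anchor, positive)
    neg_sims = [cosine_similarity(anchor, n) for n in negatives]
    # Select the negative whose similarity to the anchor is the median,
    # instead of the hardest (maximum) or easiest (minimum) negative.
    med_sim = median_low(neg_sims)
    # Penalize the triplet unless the positive is more similar to the
    # anchor than the median negative by at least `margin`.
    return max(0.0, margin + med_sim - pos_sim)


# Hypothetical 2-D embeddings, for illustration only.
negatives = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_easy = median_triplet_loss([1.0, 0.0], [1.0, 0.0], negatives)
loss_hard = median_triplet_loss([1.0, 0.0], [0.0, 1.0], negatives)
```

In this toy setting the negative similarities are 1, 0, and -1, so the median negative has similarity 0. A single outlier negative that is nearly identical to the anchor therefore does not drive the loss, which is the intuition behind avoiding over-clustering, while the triplet term still pushes the positive pair above the selected negative.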

updated: Sun Apr 18 2021 07:47:10 GMT+0000 (UTC)

published: Sun Apr 18 2021 07:47:10 GMT+0000 (UTC)

arXiv
