A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning

Pan Zhou; Caiming Xiong; Xiao-Tong Yuan; Steven Hoi

For an image query, unsupervised contrastive learning labels crops of the same image as positives and crops of other images as negatives. Although intuitive, such a native label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and it impairs performance, since some negatives are semantically similar to the query or even share its semantic class. In this work, we first prove that, for contrastive learning, inaccurate label assignment heavily impairs generalization for semantic instance discrimination, while accurate labels benefit it. Inspired by this theory, we propose a novel self-labeling refinement approach for contrastive learning. It improves label quality via two complementary modules: (i) a self-labeling refinery (SLR) to generate accurate labels and (ii) momentum mixup (MM) to enhance the similarity between a query and its positive. SLR uses a positive of a query to estimate the semantic similarity between the query and its positive and negatives, and combines the estimated similarity with the vanilla label assignment in contrastive learning to iteratively generate more accurate and informative soft labels. We theoretically show that SLR can exactly recover the true semantic labels of label-corrupted data and can supervise networks to achieve zero prediction error on classification tasks. MM randomly combines queries and positives to increase the semantic similarity between the generated virtual queries and their positives, thereby improving label accuracy. Experimental results on CIFAR10, ImageNet, VOC and COCO show the effectiveness of our method. PyTorch code and models will be released online.
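
The two modules described above can be illustrated with a minimal sketch. This is not the authors' released PyTorch code: the function names, the mixing weight `beta`, the temperature, and the use of a softmax to turn similarities into a distribution are all assumptions made for illustration. The abstract only states that SLR combines an estimated similarity with the vanilla (one-hot) label, and that MM randomly combines queries and positives.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw similarity scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def refine_labels(one_hot, positive_sims, beta=0.5, temperature=0.2):
    """Hypothetical SLR-style step: blend the vanilla one-hot label with
    similarities estimated from the query's positive.

    one_hot       -- vanilla label over [positive, negative_1, ...]
    positive_sims -- similarity of the positive to each candidate key
    beta          -- assumed weight on the estimated similarity
    """
    estimated = softmax(positive_sims, temperature)
    return [(1.0 - beta) * y + beta * p for y, p in zip(one_hot, estimated)]


def mixup(query, other_positive, query_label, other_label, lam):
    """Hypothetical MM-style step: convex combination of a query with a
    positive, and of their labels, using the same coefficient lam."""
    virtual = [lam * q + (1.0 - lam) * p for q, p in zip(query, other_positive)]
    label = [lam * a + (1.0 - lam) * b for a, b in zip(query_label, other_label)]
    return virtual, label


if __name__ == "__main__":
    # Candidate 1 is nominally a negative but is very similar to the query.
    soft = refine_labels([1.0, 0.0, 0.0], [0.9, 0.8, -0.5])
    print("refined soft label:", soft)

    virtual, label = mixup([1.0, 0.0], [0.0, 1.0],
                           [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], lam=0.75)
    print("virtual query:", virtual, "mixed label:", label)
```

In this toy run the refined label stays a valid distribution but moves some mass from the positive to the semantically similar "negative", which is the behavior the abstract attributes to soft labels. In practice `lam` would typically be sampled at random (for example from a Beta distribution, as in standard mixup); that choice is also an assumption here.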

updated: Mon Jun 28 2021 14:24:52 GMT+0000 (UTC)

published: Mon Jun 28 2021 14:24:52 GMT+0000 (UTC)

arXiv
