Momentum Contrast for Unsupervised Visual Representation Learning

Kaiming He; Haoqi Fan; Yuxin Wu; Saining Xie; Ross Girshick



We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic dictionary with a queue and a moving-averaged encoder. This enables building a large and consistent dictionary on-the-fly that facilitates contrastive unsupervised learning. MoCo provides competitive results under the common linear protocol on ImageNet classification. More importantly, the representations learned by MoCo transfer well to downstream tasks. MoCo can outperform its supervised pre-training counterpart in 7 detection/segmentation tasks on PASCAL VOC, COCO, and other datasets, sometimes surpassing it by large margins. This suggests that the gap between unsupervised and supervised representation learning has been largely closed in many vision tasks.
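The abstract names three ingredients: a queue of keys, a moving-averaged key encoder, and a contrastive dictionary look-up. A minimal pure-Python sketch of how they might fit together follows. The abstract gives no formulas, so the update rule, the InfoNCE-style loss, and all function names and values here (`momentum_update`, `info_nce_loss`, `m`, `temperature`, the toy vectors) are illustrative assumptions, not the authors' implementation.

```python
from collections import deque
import math
import random


def momentum_update(key_params, query_params, m=0.999):
    """Moving average: each key parameter drifts slowly toward the query one.

    The default m is an assumed value; the abstract does not state it.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(query, positive_key, negative_keys, temperature=0.07):
    """Cross-entropy over one positive and many negatives (positive at index 0).

    The temperature is an assumed value for illustration.
    """
    logits = [dot(query, positive_key) / temperature]
    logits += [dot(query, k) / temperature for k in negative_keys]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


# The dictionary as a fixed-size FIFO queue: once full, appending a new key
# silently drops the oldest one.
queue = deque(maxlen=4)
rng = random.Random(0)
for _ in range(6):
    queue.append(l2_normalize([rng.gauss(0, 1) for _ in range(3)]))

# Toy query and matching key standing in for two encoder outputs.
query = l2_normalize([1.0, 0.0, 0.0])
positive = l2_normalize([0.9, 0.1, 0.0])
loss = info_nce_loss(query, positive, list(queue))

# One momentum step on toy "parameters" with a deliberately small m.
key_params = momentum_update([0.0, 0.0], [1.0, 1.0], m=0.9)
```

In this sketch the queue decouples dictionary size from batch size, while the slow parameter update keeps keys that were encoded at different steps roughly comparable, which is one reading of the "large and consistent" dictionary the abstract describes.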

updated: Mon Mar 23 2020 18:36:55 GMT+0000 (UTC)

published: Wed Nov 13 2019 18:53:26 GMT+0000 (UTC)

arXiv

