Efficient Visual Pretraining with Contrastive Detection

Olivier J. Hénaff; Skanda Koppula; Jean-Baptiste Alayrac; Aaron van den Oord; Oriol Vinyals; João Carreira

Self-supervised pretraining has been shown to yield powerful representations for transfer learning. These performance gains come at a large computational cost, however, with state-of-the-art methods requiring an order of magnitude more computation than supervised pretraining. We tackle this computational bottleneck by introducing a new self-supervised objective, contrastive detection, which tasks representations with identifying object-level features across augmentations. This objective extracts a rich learning signal per image, leading to state-of-the-art transfer accuracy on a variety of downstream tasks while requiring up to 10x less pretraining. In particular, our strongest ImageNet-pretrained model performs on par with SEER, one of the largest self-supervised systems to date, which uses 1000x more pretraining data. Finally, our objective seamlessly handles pretraining on more complex images such as those in COCO, closing the gap with supervised transfer learning from COCO to PASCAL.
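The idea of "identifying object-level features across augmentations" can be illustrated with a small sketch. This is not the authors' implementation. It assumes each augmented view comes with a feature vector and an integer object-mask id per spatial location, averages the features within each mask, and applies a standard softmax contrastive loss in which the same mask in the other view is the positive and the remaining masks are negatives. The function names, the temperature value, and the use of in-image negatives only are illustrative assumptions.

```python
import math


def mask_pool(features, masks, num_masks):
    """Average the feature vectors of all locations that share a mask id.

    features: list of per-location feature vectors (flattened spatial grid).
    masks: list of integer mask ids, one per location (hypothetical input).
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_masks)]
    counts = [0] * num_masks
    for vec, m in zip(features, masks):
        counts[m] += 1
        for d in range(dim):
            sums[m][d] += vec[d]
    # max(c, 1) avoids division by zero for masks with no locations.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def contrastive_detection_loss(feat_a, masks_a, feat_b, masks_b,
                               num_masks, temperature=0.1):
    """Mean cross-entropy of matching each mask in view A to view B.

    temperature=0.1 is an assumed value, not taken from the abstract.
    """
    pooled_a = [l2_normalize(v) for v in mask_pool(feat_a, masks_a, num_masks)]
    pooled_b = [l2_normalize(v) for v in mask_pool(feat_b, masks_b, num_masks)]
    total = 0.0
    for m in range(num_masks):
        # Cosine similarity of mask m in view A to every mask in view B.
        logits = [
            sum(x * y for x, y in zip(pooled_a[m], pooled_b[n])) / temperature
            for n in range(num_masks)
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        # Cross-entropy with mask m in view B as the positive.
        total += log_denom - logits[m]
    return total / num_masks


# Toy example: two views of a 4-location image containing two objects.
view_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [1.0, 0.0], [0.1, 0.9], [0.0, 1.0]]
masks = [0, 0, 1, 1]

aligned_loss = contrastive_detection_loss(view_a, masks, view_b, masks, 2)
swapped_loss = contrastive_detection_loss(view_a, masks, view_b, [1, 1, 0, 0], 2)
```

In this toy setup the loss is small when the mask ids agree across the two views and large when they are swapped, which is the behavior one would expect from an objective that rewards consistent object-level features.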

updated: Thu Aug 05 2021 15:51:15 GMT+0000 (UTC)

published: Fri Mar 19 2021 14:05:12 GMT+0000 (UTC)

arXiv
