Self-supervised Pretraining of Visual Features in the Wild

Priya Goyal; Mathilde Caron; Benjamin Lefaudeux; Min Xu; Pengchao Wang; Vivek Pai; Mannat Singh; Vitaliy Liptchinsky; Ishan Misra; Armand Joulin; Piotr Bojanowski

Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a controlled environment, namely the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to its expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Code: https://github.com/facebookresearch/vissl

updated: Tue Mar 02 2021 19:12:29 GMT+0000 (UTC)

published: Tue Mar 02 2021 19:12:29 GMT+0000 (UTC)

arXiv
