Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning

Zhiqiang Shen; Zechun Liu; Zhuang Liu; Marios Savvides; Trevor Darrell; Eric Xing

Recent advanced unsupervised learning approaches use a siamese-like framework to compare two "views" of the same image in order to learn representations. Making the two views distinctive is central to ensuring that unsupervised methods learn meaningful information. However, such frameworks are sometimes prone to overfitting when the augmentations used to generate the two views are not strong enough, which leads to over-confidence on the training data. This drawback hinders the model from learning subtle variance and fine-grained information. To address this, we introduce the concept of distance in label space into unsupervised learning and make the model aware of the soft degree of similarity between positive or negative pairs by mixing the input data space, so that the input and loss spaces work together. Despite its conceptual simplicity, we show empirically that with this solution, unsupervised image mixtures (Un-Mix), we can learn subtler, more robust, and more generalized representations from the transformed input and the corresponding new label space. Extensive experiments are conducted on CIFAR-10, CIFAR-100, STL-10, Tiny ImageNet, and standard ImageNet with popular unsupervised methods including SimCLR, BYOL, MoCo V1&V2, and SwAV. Our proposed image mixture and label assignment strategy obtains consistent improvements of 1~3% while following exactly the same hyperparameters and training procedures as the base methods. Code is publicly available at https://github.com/szq0214/Un-Mix.
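The idea of mixing the input space and giving the loss a matching soft degree of similarity can be illustrated with a small NumPy sketch. It is not taken from the released code, and everything in it is an assumption made for illustration: a mixup-style blend of each image with the image at the mirrored batch position, an InfoNCE-style contrastive loss, a mixing coefficient `lam` drawn from Beta(1, 1), a random linear map standing in for the encoder, and all of the function names. The repository linked above remains the reference for the actual method.

```python
import numpy as np


def mix_with_reversed_batch(images, lam):
    """Blend each image with the image at the mirrored batch position.

    Assumption: a mixup-style global blend, with lam in [0, 1] as the
    weight kept by the original image.
    """
    return lam * images + (1.0 - lam) * images[::-1]


def info_nce(queries, keys, temperature=0.2):
    """Mean InfoNCE loss where queries[i] and keys[i] are the positive pair."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def soft_mixture_loss(mixed_emb, view_emb, lam, temperature=0.2):
    """Weight the two possible positives by their share of the mixture.

    The mixed sample is lam-similar to the view in normal batch order and
    (1 - lam)-similar to the view in reversed batch order.
    """
    normal = info_nce(mixed_emb, view_emb, temperature)
    reverse = info_nce(mixed_emb, view_emb[::-1], temperature)
    return lam * normal + (1.0 - lam) * reverse


# Toy usage: random "images" and a random linear map as a stand-in encoder.
rng = np.random.default_rng(0)
images = rng.normal(size=(8, 3, 32, 32))
lam = float(rng.beta(1.0, 1.0))
mixed = mix_with_reversed_batch(images, lam)

projection = rng.normal(size=(3 * 32 * 32, 16))


def encode(batch):
    """Hypothetical encoder: flatten each image and project it linearly."""
    return batch.reshape(len(batch), -1) @ projection


loss = soft_mixture_loss(encode(mixed), encode(images), lam)
print(f"lam={lam:.3f}, loss={loss:.4f}")
```

With `lam = 1` the mixture reduces to the original batch and the loss reduces to the plain contrastive term, so a sketch like this leaves a base method unchanged in that limit.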

updated: Fri Dec 17 2021 18:42:21 GMT+0000 (UTC)

published: Wed Mar 11 2020 17:59:04 GMT+0000 (UTC)

arXiv
