ActiveMatch: End-to-end Semi-supervised Active Representation Learning

Xinkai Yuan; Zilinghan Li; Gaoang Wang

Semi-supervised learning (SSL) is an efficient framework for training models on both labeled and unlabeled data, but it may produce ambiguous, hard-to-distinguish representations when labeled samples are scarce. With a human in the loop, active learning can iteratively select informative unlabeled samples for labeling and training, improving performance within the SSL framework. However, most existing active learning approaches rely on pre-trained features, which makes them unsuitable for end-to-end learning. To address these drawbacks of SSL, we propose ActiveMatch, a novel end-to-end representation learning method that combines SSL with contrastive learning and active learning to make full use of limited labels. Starting from a small amount of labeled data and using unsupervised contrastive learning as a warm-up, ActiveMatch then combines SSL with supervised contrastive learning and actively selects the most representative samples for labeling during training, yielding better representations for classification. Compared with MixMatch and FixMatch given the same amount of labeled data, ActiveMatch achieves state-of-the-art performance, with 89.24% accuracy on CIFAR-10 with 100 collected labels and 92.20% accuracy with 200 collected labels.
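
To make the supervised contrastive component mentioned in the abstract more concrete, the following is a minimal NumPy sketch of a supervised contrastive loss in its commonly used form: embeddings that share a label are pulled together, and all other samples in the batch act as negatives. The abstract does not give the exact loss used by ActiveMatch, so the function name `supervised_contrastive_loss`, the `temperature` default, and the averaging scheme are assumptions made for illustration only.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Illustrative supervised contrastive loss (assumed form, not the paper's code).

    embeddings: array of shape (n, d) with n >= 2 and non-zero rows.
    labels:     array of shape (n,) with one class label per row.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)

    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = z.shape[0]
    self_mask = np.eye(n, dtype=bool)

    # Exclude each anchor from its own denominator.
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )

    # Positives: same label as the anchor, excluding the anchor itself.
    pos_mask = (y[:, None] == y[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive partner in this batch

    # Mean log-probability of the positives, averaged over valid anchors.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())


# Toy batch: two tight clusters in 2-D.
toy_embeddings = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
loss_aligned = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy batch the loss is much smaller when the labels agree with the geometric clusters (`loss_aligned`) than when they cut across them (`loss_mixed`), which is the behavior a label-aware contrastive term is meant to reward.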

updated: Fri Aug 05 2022 04:54:16 GMT+0000 (UTC)

published: Wed Oct 06 2021 06:07:40 GMT+0000 (UTC)

arXiv
