ABC: Auxiliary Balanced Classifier for Class-imbalanced Semi-supervised Learning

Hyuck Lee; Seungjae Shin; Heeyoung Kim

Existing semi-supervised learning (SSL) algorithms typically assume class-balanced datasets, although the class distributions of many real-world datasets are imbalanced. In general, classifiers trained on a class-imbalanced dataset are biased toward the majority classes. This issue is more severe for SSL algorithms because they use biased predictions on unlabeled data for training. However, traditional class-imbalanced learning techniques, which are designed for labeled data, cannot be readily combined with SSL algorithms. We propose a scalable class-imbalanced SSL algorithm that can effectively use unlabeled data while mitigating class imbalance. It introduces a single-layer auxiliary balanced classifier (ABC) attached to a representation layer of an existing SSL algorithm. The ABC is trained with a class-balanced loss over a minibatch, while using high-quality representations learned from all data points in the minibatch by the backbone SSL algorithm to avoid overfitting and information loss. Moreover, we use consistency regularization, a recent SSL technique for utilizing unlabeled data, in a modified way: to train the ABC to be balanced among the classes, unlabeled data are selected with the same probability for each class. The proposed algorithm achieves state-of-the-art performance in various class-imbalanced SSL experiments using four benchmark datasets.
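A class-balanced minibatch loss of the kind the abstract describes can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each sample's loss term is kept with a Bernoulli probability equal to the smallest class count divided by that sample's class count, so that every class contributes the same expected number of terms. The function names (`keep_probabilities`, `balanced_minibatch_loss`) and the pure-Python cross-entropy are hypothetical and chosen only for this example.

```python
import math
import random


def keep_probabilities(class_counts):
    """Assumed rule: keep a sample of class c with probability N_min / N_c.

    The rarest class is always kept; more frequent classes are kept less often.
    """
    smallest = min(class_counts)
    return [smallest / n for n in class_counts]


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_minibatch_loss(logits_batch, labels, class_counts, rng):
    """Cross-entropy over a minibatch with a per-sample Bernoulli mask.

    Masked-out samples contribute zero loss but still count in the batch
    size used for averaging.
    """
    probs = keep_probabilities(class_counts)
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        keep = 1.0 if rng.random() < probs[label] else 0.0
        total += keep * -math.log(softmax(logits)[label])
    return total / len(labels)


# Illustrative class counts for a three-class, long-tailed dataset.
counts = [1000, 100, 10]
rng = random.Random(0)
batch_logits = [[2.0, 0.5, 0.1], [0.3, 1.2, 0.4], [0.0, 0.0, 0.0]]
batch_labels = [0, 1, 2]
loss = balanced_minibatch_loss(batch_logits, batch_labels, counts, rng)
```

Under this assumed rule, the expected number of kept samples is the same for every class (the count of the rarest class), which is one way to make the auxiliary classifier's loss balanced without discarding data from the backbone's training. For unlabeled data, the same mask could be driven by the predicted class instead of a ground-truth label; that too is an assumption of this sketch.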

updated: Wed Oct 20 2021 04:07:48 GMT+0000 (UTC)

published: Wed Oct 20 2021 04:07:48 GMT+0000 (UTC)

arXiv
