Multi-label Iterated Learning for Image Classification with Label Ambiguity

Sai Rajeswar; Pau Rodriguez; Soumye Singhal; David Vazquez; Aaron Courville

Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled, since images in which multiple object classes are present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data. Inspired by the language emergence literature, we propose multi-label iterated learning (MILe) to incorporate the inductive biases of multi-label learning from single labels using the framework of iterated learning. MILe is a simple yet effective procedure that builds a multi-label description of the image by propagating binary predictions through successive generations of teacher and student networks with a learning bottleneck. Experiments show that our approach yields systematic benefits on ImageNet accuracy as well as ReaL F1 score, which indicates that MILe deals better with label ambiguity than the standard training procedure, even when fine-tuning from self-supervised weights. We also show that MILe is effective at reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision. Furthermore, MILe improves performance in class-incremental settings such as IIRC and is robust to distribution shifts. Code: https://github.com/rajeswar18/MILe
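
The teacher/student loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository above for that); it is one plausible reading under stated assumptions. The linear per-class sigmoid model, the 0.5 threshold, the step budgets, the choice to initialize each teacher from the previous student, and all function names (`mile_sketch`, `train_steps`, `binarize`, and so on) are hypothetical choices made for illustration. The limited number of training steps per phase stands in for the learning bottleneck.

```python
import math
import random


def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    """Independent per-class sigmoid scores for one feature vector."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def binarize(probs, threshold=0.5):
    """Turn sigmoid scores into a binary multi-label target (assumed threshold)."""
    return [1.0 if p > threshold else 0.0 for p in probs]


def one_hot(label, num_classes):
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def train_steps(weights, data, target_fn, steps, lr, rng):
    """Run a limited number of SGD steps on per-class binary cross-entropy.

    The small step budget plays the role of the learning bottleneck.
    """
    for _ in range(steps):
        x, y = rng.choice(data)
        target = target_fn(x, y)
        probs = predict(weights, x)
        for k, row in enumerate(weights):
            grad = probs[k] - target[k]  # gradient of BCE w.r.t. the logit
            for j in range(len(row)):
                row[j] -= lr * grad * x[j]
    return weights


def mile_sketch(data, num_classes, dim, generations=3,
                teacher_steps=200, student_steps=200, lr=0.5, seed=0):
    """Toy iterated-learning loop over teacher/student generations."""
    rng = random.Random(seed)
    student = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(generations):
        # Assumption: each teacher starts from the previous student and
        # is trained briefly on the single ground-truth labels.
        teacher = [row[:] for row in student]
        train_steps(teacher, data,
                    lambda x, y: one_hot(y, num_classes),
                    teacher_steps, lr, rng)
        # The student then imitates the teacher's binarized predictions,
        # which may mark more than one class as present.
        train_steps(student, data,
                    lambda x, y: binarize(predict(teacher, x)),
                    student_steps, lr, rng)
    return student


if __name__ == "__main__":
    # Hypothetical data: the first two "images" are identical and contain
    # classes 0 and 1, but each carries only one of the two labels.
    toy = [([1.0, 1.0, 0.0], 0), ([1.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 2)]
    model = mile_sketch(toy, num_classes=3, dim=3)
    print(binarize(predict(model, [1.0, 1.0, 0.0])))
```

Per-class sigmoid outputs are used here instead of a softmax so that several classes can be switched on for the same input, which is what allows a multi-label description to emerge from single-label supervision in this sketch.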

updated: Tue Nov 23 2021 22:10:00 GMT+0000 (UTC)

published: Tue Nov 23 2021 22:10:00 GMT+0000 (UTC)

arXiv
