Self Supervision to Distillation for Long-Tailed Visual Recognition

Tianhao Li; Limin Wang; Gangshan Wu

Deep learning has achieved remarkable progress in visual recognition on large-scale balanced datasets but still performs poorly on real-world long-tailed data. Previous methods often adopt class re-balanced training strategies to alleviate the imbalance issue, but these risk over-fitting the tail classes. The recent decoupling method overcomes the over-fitting issue by using a multi-stage training scheme, yet it is still incapable of capturing tail class information in the feature learning stage. In this paper, we show that soft labels can serve as a powerful solution for incorporating label correlation into a multi-stage training scheme for long-tailed recognition. The intrinsic relation between classes embodied by soft labels turns out to be helpful for long-tailed recognition by transferring knowledge from head to tail classes. Specifically, we propose a conceptually simple yet particularly effective multi-stage training scheme, termed Self Supervision to Distillation (SSD). This scheme is composed of two parts. First, we introduce a self-distillation framework for long-tailed recognition, which can mine the label relation automatically. Second, we present a new distillation label generation module guided by self-supervision. The distilled labels integrate information from both the label and data domains and can model the long-tailed distribution effectively. We conduct extensive experiments, and our method achieves state-of-the-art results on three long-tailed recognition benchmarks: ImageNet-LT, CIFAR100-LT and iNaturalist 2018. Our SSD outperforms the strong LWS baseline by 2.7% to 4.5% on various datasets. The code is available at https://github.com/MCG-NJU/SSD-LT.
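
The abstract does not state the training objective, so the following is only a minimal sketch of how soft labels are commonly used in distillation, not the SSD implementation. It assumes a generic temperature-scaled distillation loss: a cross-entropy term on the hard label plus a KL term that pulls the student's softened prediction toward a teacher's soft label. The function names and the `temperature` and `alpha` parameters are hypothetical and chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the ground-truth (hard) label."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, target_index,
                      temperature=2.0, alpha=0.5):
    """Generic soft-label distillation loss (an assumption, not the paper's formula).

    alpha weights the hard-label term; (1 - alpha) weights the soft-label term.
    """
    hard = cross_entropy(softmax(student_logits), target_index)
    soft_teacher = softmax(teacher_logits, temperature)  # the soft label
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor is the usual rescaling that keeps gradient magnitudes comparable.
    soft = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    return alpha * hard + (1 - alpha) * soft
```

In this sketch the teacher's soft label assigns non-zero probability to classes related to the ground truth, which is one way knowledge about head classes could reach a tail-class sample; how SSD actually generates and applies its distilled labels is described in the paper and the linked repository.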

updated: Thu Sep 09 2021 07:38:30 GMT+0000 (UTC)

published: Thu Sep 09 2021 07:38:30 GMT+0000 (UTC)

arXiv
