ResLT: Residual Learning for Long-tailed Recognition

Jiequan Cui; Shu Liu; Zhuotao Tian; Zhisheng Zhong; Jiaya Jia

Deep learning algorithms face great challenges with long-tailed data distributions, which are nevertheless quite common in real-world scenarios. Previous methods tackle the problem from either the input space (re-sampling classes with different frequencies) or the loss space (re-weighting classes with different weights), and suffer from heavy over-fitting to tail classes or hard optimization during training. To alleviate these issues, we propose a more fundamental perspective for long-tailed recognition, i.e., the parameter space, and aim to preserve specific capacity for classes with low frequencies. From this perspective, the trivial solution, which uses separate branches for the head, medium, and tail classes and then sums their outputs as the final result, is not feasible. Instead, we design an effective residual fusion mechanism: one main branch is optimized to recognize images from all classes, while two residual branches are gradually fused and optimized to enhance recognition of images from medium+tail classes and tail classes, respectively. The branches are then aggregated into final results by additive shortcuts. We test our method on several benchmarks, i.e., long-tailed versions of CIFAR-10, CIFAR-100, Places, ImageNet, and iNaturalist 2018. Experimental results demonstrate the effectiveness of our method. Our code is available at https://github.com/jiequancui/ResLT.
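
The additive aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that); it only assumes that each branch emits one logit per class and that the final prediction is taken from the element-wise sum. The function names, the 4-class split, and the logit values are all hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_logits(main, res_medium_tail, res_tail):
    """Additive shortcut (assumed form): sum the three branches element-wise."""
    return [a + b + c for a, b, c in zip(main, res_medium_tail, res_tail)]


def branch_sample_mask(label, medium_tail_classes, tail_classes):
    """Which branches a training sample with this label would supervise.

    Assumption: the main branch sees every class, while each residual
    branch is trained only on samples from its own class group.
    """
    return {
        "main": True,
        "medium_tail": label in medium_tail_classes,
        "tail": label in tail_classes,
    }


# Hypothetical 4-class problem: class 0 is head, 1-2 are medium, 3 is tail.
MEDIUM_TAIL = {1, 2, 3}
TAIL = {3}

# Made-up logits for one tail-class image.
main_logits = [2.0, 1.0, 0.5, 0.2]      # main branch leans towards the head class
res_mt_logits = [0.0, 0.3, 0.4, 0.5]    # residual branch for medium+tail classes
res_t_logits = [0.0, 0.0, 0.0, 1.5]     # residual branch for tail classes

fused = fuse_logits(main_logits, res_mt_logits, res_t_logits)
probs = softmax(fused)
prediction = max(range(len(probs)), key=probs.__getitem__)
main_only = max(range(len(main_logits)), key=main_logits.__getitem__)
```

With these made-up numbers the main branch alone would pick the head class, whereas the fused logits favour the tail class, which is the kind of correction the residual branches are meant to provide. How the branches share parameters and how their losses are combined is not stated in the abstract and is left out of the sketch.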

updated: Thu May 12 2022 02:07:55 GMT+0000 (UTC)

published: Tue Jan 26 2021 08:43:50 GMT+0000 (UTC)

arXiv
