Calibrating Class Activation Maps for Long-Tailed Visual Recognition

Chi Zhang; Guosheng Lin; Lvlong Lai; Henghui Ding; Qingyao Wu

Real-world visual recognition problems often exhibit long-tailed distributions, where the amount of training data varies greatly across categories. Standard classification models trained on such distributions often make predictions biased towards the head classes while generalizing poorly to the tail classes. In this paper, we present two effective modifications of CNNs to improve network learning from long-tailed distributions. First, we present a Class Activation Map Calibration (CAMC) module that improves the learning and prediction of network classifiers by enforcing that network predictions are based on important image regions. The proposed CAMC module highlights the correlated image regions across data and reinforces the representations in these areas to obtain a better global representation for classification. Furthermore, we investigate the use of normalized classifiers for representation learning in long-tailed problems. Our empirical study demonstrates that simply scaling the outputs of the classifier with an appropriate scalar effectively improves the classification accuracy on tail classes without losing accuracy on head classes. We conduct extensive experiments to validate the effectiveness of our design and set a new state of the art on five benchmarks: ImageNet-LT, Places-LT, iNaturalist 2018, CIFAR10-LT, and CIFAR100-LT.
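The idea of a normalized classifier whose outputs are multiplied by a scalar can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "normalized" means L2-normalizing both the feature vector and each class weight vector (a cosine classifier), and the function names, the toy vectors, and the scale value of 16.0 are all hypothetical choices made only for illustration.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Divide vec by its Euclidean norm; eps guards against zero vectors."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def scaled_cosine_logits(feature, class_weights, scale):
    """Logit for each class = scale * cosine(feature, class weight).

    Assumption: this cosine form is one common reading of a
    "normalized classifier"; the abstract does not give the formula.
    """
    f_hat = l2_normalize(feature)
    logits = []
    for w in class_weights:
        w_hat = l2_normalize(w)
        logits.append(scale * sum(a * b for a, b in zip(f_hat, w_hat)))
    return logits


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy data: class 0 has a large-magnitude weight vector,
# class 1 has a small one that points exactly along the feature.
feature = [1.0, 1.0]
class_weights = [[10.0, 0.0], [0.1, 0.1]]

# Plain linear classifier: the weight magnitude dominates the score.
raw_logits = [sum(a * b for a, b in zip(feature, w)) for w in class_weights]

# Normalized classifier: only direction matters, then a scalar rescales.
cos_logits = scaled_cosine_logits(feature, class_weights, scale=16.0)
probs = softmax(cos_logits)
```

In this toy case the plain linear classifier ranks class 0 first because of its large weight norm, whereas the normalized classifier ranks class 1 first because its weight is better aligned with the feature. The scalar does not change the ranking; it only controls how sharply the softmax separates the classes, which is why its value would need to be tuned in practice.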

updated: Sun Aug 29 2021 05:45:03 GMT+0000 (UTC)

published: Sun Aug 29 2021 05:45:03 GMT+0000 (UTC)

arXiv
