DBN-Mix: Training Dual Branch Network Using Bilateral Mixup Augmentation for Long-Tailed Visual Recognition

Jae Soon Baik; In Young Yoon; Jun Won Choi

There is growing interest in the challenging visual perception task of learning from long-tailed class distributions. Extreme class imbalance in the training dataset biases the model toward recognizing majority-class data over minority-class data. Recently, the dual branch network (DBN) framework has been proposed, in which two branch networks, the conventional branch and the re-balancing branch, are employed to improve the accuracy of long-tailed visual recognition. The re-balancing branch uses a reversed sampler to generate class-balanced training samples and thereby mitigate the bias due to class imbalance. Although this strategy has been quite successful in handling bias, using a reversed sampler for training can degrade representation learning performance. To alleviate this issue, the conventional method used a carefully designed cumulative learning strategy, in which the influence of the re-balancing branch gradually increases over the entire training phase. In this study, we aim to develop a simple yet effective method that improves the performance of DBN without cumulative learning, which is difficult to optimize. We devise a simple data augmentation method, termed bilateral mixup augmentation, which combines one sample from the uniform sampler with another sample from the reversed sampler to produce a training sample. Furthermore, we present class-conditional temperature scaling, which mitigates bias toward the majority class in the proposed DBN architecture. Our experiments on widely used long-tailed visual recognition datasets show that bilateral mixup augmentation is quite effective in improving the representation learning performance of DBNs, and that the proposed method achieves state-of-the-art performance for some categories.
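As a rough illustration of the bilateral mixup idea described above, the following minimal Python sketch mixes one example drawn by a uniform sampler with one drawn by a reversed sampler. The abstract does not give the exact formulation, so several details here are assumptions: the reversed sampler is taken to pick classes with probability inversely proportional to class frequency, the mixing coefficient `lam` is drawn from a Beta(1, 1) distribution as in standard mixup, and the function names `reversed_class_probs` and `bilateral_mixup` are hypothetical.

```python
import random


def reversed_class_probs(class_counts):
    """Class sampling probabilities inversely proportional to class frequency.

    Assumption: this inverse-frequency form is one common choice for a
    reversed sampler; the abstract does not specify the exact rule.
    """
    inverse = [1.0 / count for count in class_counts]
    total = sum(inverse)
    return [weight / total for weight in inverse]


def bilateral_mixup(x_uniform, y_uniform, x_reversed, y_reversed, num_classes, lam):
    """Convexly combine a uniform-sampler example with a reversed-sampler example.

    Inputs are flat feature lists and integer class labels; the mixed label
    is a soft label that puts weight lam on y_uniform and 1 - lam on y_reversed.
    """
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_uniform, x_reversed)]
    y_mix = [0.0] * num_classes
    y_mix[y_uniform] += lam
    y_mix[y_reversed] += 1.0 - lam
    return x_mix, y_mix


# Toy long-tailed setting: class 0 is the head class, class 2 the tail class.
class_counts = [500, 50, 5]
probs = reversed_class_probs(class_counts)

rng = random.Random(0)
lam = rng.betavariate(1.0, 1.0)  # assumed mixing distribution, as in standard mixup

# One head-class example (uniform sampler) mixed with one tail-class example
# (reversed sampler); the feature vectors are placeholders.
x_mix, y_mix = bilateral_mixup([1.0, 0.0], 0, [0.0, 1.0], 2, num_classes=3, lam=lam)
```

In this toy setup the reversed sampler favors the tail class, so each mixed training sample tends to pair a majority-class example with a minority-class one.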

updated: Tue Jul 05 2022 17:01:27 GMT+0000 (UTC)

published: Tue Jul 05 2022 17:01:27 GMT+0000 (UTC)

arXiv
