Pure Noise to the Rescue of Insufficient Data: Improving Imbalanced Classification by Training on Random Noise Images

Shiran Zada; Itay Benou; Michal Irani

Despite remarkable progress on visual recognition tasks, deep neural-nets still struggle to generalize well when training data is scarce or highly imbalanced, rendering them extremely vulnerable to real-world examples. In this paper, we present a surprisingly simple yet highly effective method to mitigate this limitation: using pure noise images as additional training data. Unlike the common use of additive noise or adversarial noise for data augmentation, we propose an entirely different perspective by directly training on pure random noise images. We present a new Distribution-Aware Routing Batch Normalization layer (DAR-BN), which enables training on pure noise images in addition to natural images within the same network. This encourages generalization and suppresses overfitting. Our proposed method significantly improves imbalanced classification performance, obtaining state-of-the-art results on a large variety of long-tailed image classification datasets (CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, Places-LT, and CelebA-5). Furthermore, our method is extremely simple and easy to use as a general new augmentation tool (on top of existing augmentations), and can be incorporated in any training scheme. It does not require any specialized data generation or training procedures, thus keeping training fast and efficient.
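
The abstract names the two ingredients (pure noise images as extra training data, and a batch normalization layer that routes by input distribution) but gives no implementation details. The sketch below is one plausible reading, not the authors' code. It assumes that noise images are drawn from a per-channel Gaussian matched to hypothetical dataset statistics, and that "distribution-aware routing" means normalizing natural and noise activations with separate batch statistics. The function names, the clipping to [0, 1], and the omission of learnable affine parameters are all illustrative assumptions.

```python
import random
import statistics


def sample_noise_image(height, width, channel_means, channel_stds, rng):
    """Sample a pure-noise image laid out as [channel][row][col].

    Assumption: each pixel is drawn from a Gaussian with the given
    per-channel mean and standard deviation, then clipped to [0, 1].
    """
    return [
        [
            [min(1.0, max(0.0, rng.gauss(mu, sigma))) for _ in range(width)]
            for _ in range(height)
        ]
        for mu, sigma in zip(channel_means, channel_stds)
    ]


def routed_normalize(activations, is_noise, eps=1e-5):
    """Normalize scalar activations with separate statistics per group.

    Assumption: natural and noise samples in the same batch are each
    standardized with their own mean and variance, so that noise inputs
    do not distort the statistics used for natural images.
    """
    out = [0.0] * len(activations)
    for flag in (False, True):
        idx = [i for i, noise in enumerate(is_noise) if noise == flag]
        if not idx:
            continue  # this group is absent from the batch
        group = [activations[i] for i in idx]
        mean = statistics.fmean(group)
        var = statistics.pvariance(group, mu=mean)
        for i in idx:
            out[i] = (activations[i] - mean) / (var + eps) ** 0.5
    return out


rng = random.Random(0)
# Hypothetical channel statistics, chosen only for illustration.
noise_img = sample_noise_image(4, 4, [0.5, 0.5, 0.5], [0.2, 0.2, 0.2], rng)
normalized = routed_normalize(
    [1.0, 3.0, 100.0, 200.0], [False, False, True, True]
)
```

In this toy batch the two natural activations and the two noise activations are each standardized within their own group, so the much larger noise values do not shift the natural samples' normalized outputs.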

updated: Sat Jun 18 2022 15:08:42 GMT+0000 (UTC)

published: Thu Dec 16 2021 11:51:35 GMT+0000 (UTC)

arXiv