ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation

Chenyang Wang; Junjun Jiang; Xiong Zhou; Xianming Liu

Data augmentation (DA) is a widely used technique for improving the training of deep neural networks. Recent DA techniques that achieve state-of-the-art performance consistently meet the need for diversity in the augmented training samples. However, a highly diverse augmentation strategy usually introduces out-of-distribution (OOD) augmented samples, and these samples impair performance. To alleviate this issue, we propose ReSmooth, a framework that first detects OOD samples among the augmented samples and then leverages them. Specifically, we first fit a Gaussian mixture model to the loss distribution of both the original and augmented samples and accordingly split these samples into in-distribution (ID) and OOD samples. We then start a new training run in which ID and OOD samples are incorporated with different smooth labels. By treating ID and OOD samples differently, we can make better use of the diverse augmented data. Further, we combine our ReSmooth framework with negative data augmentation strategies. By properly handling their intentionally created OOD samples, the classification performance of negative data augmentations is substantially improved. Experiments on several classification benchmarks show that ReSmooth can be easily extended to existing augmentation strategies (such as RandAugment, rotate, and jigsaw) and improves on them. Our code is available at https://github.com/Chenyang4/ReSmooth.
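
The two stages described in the abstract (fit a Gaussian mixture to per-sample losses, then assign different smooth labels to ID and OOD samples) can be illustrated with a minimal sketch. This is not the authors' implementation; see the linked repository for that. Everything here is an assumption made for illustration: the function names, the pure-Python EM fit, the choice to treat the higher-loss component as OOD, the 0.5 posterior threshold, and the smoothing strengths `eps_id` and `eps_ood`.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(losses, n_iter=100, var_floor=1e-6):
    """Fit a 1-D, two-component Gaussian mixture to per-sample losses with EM.

    Returns (weights, means, variances); component 0 has the lower mean.
    """
    n = len(losses)
    grand_mean = sum(losses) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in losses) / n, var_floor)
    weights = [0.5, 0.5]
    means = [min(losses), max(losses)]  # start the components at the extremes
    variances = [grand_var, grand_var]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each loss.
        resp = []
        for x in losses:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append((0.5, 0.5) if total == 0.0 else (p[0] / total, p[1] / total))
        # M-step: re-estimate each component from its weighted samples.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
            variances[k] = max(spread, var_floor)

    if means[0] > means[1]:  # keep the low-loss component at index 0
        weights.reverse()
        means.reverse()
        variances.reverse()
    return weights, means, variances


def ood_probability(loss, gmm):
    """Posterior probability that a loss belongs to the high-mean component.

    Assumption: the high-loss component is the one treated as OOD.
    """
    weights, means, variances = gmm
    p = [weights[k] * gaussian_pdf(loss, means[k], variances[k]) for k in range(2)]
    total = p[0] + p[1]
    return 0.5 if total == 0.0 else p[1] / total


def smooth_label(true_class, num_classes, eps):
    """Standard label smoothing: mix the one-hot target with a uniform one."""
    target = [eps / num_classes] * num_classes
    target[true_class] += 1.0 - eps
    return target


def resmooth_targets(losses, labels, num_classes, gmm,
                     eps_id=0.0, eps_ood=0.1, threshold=0.5):
    """Build soft targets, smoothing OOD samples more strongly than ID ones.

    eps_id, eps_ood and threshold are illustrative values, not the paper's.
    """
    targets = []
    for loss, label in zip(losses, labels):
        eps = eps_ood if ood_probability(loss, gmm) > threshold else eps_id
        targets.append(smooth_label(label, num_classes, eps))
    return targets


if __name__ == "__main__":
    random.seed(0)
    # Synthetic losses: a tight low-loss cluster and a broad high-loss cluster.
    demo_losses = ([random.gauss(0.2, 0.05) for _ in range(200)]
                   + [random.gauss(2.0, 0.3) for _ in range(50)])
    demo_gmm = fit_two_component_gmm(demo_losses)
    print("component means:", demo_gmm[1])
    print(resmooth_targets([0.2, 2.0], [0, 1], 2, demo_gmm))
```

In practice the losses would come from a model trained in a first pass, and the soft targets would feed a cross-entropy loss in the second training run. A library mixture model could replace the hand-written EM loop; it is written out here only to keep the sketch free of dependencies.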

updated: Sun Dec 04 2022 06:53:53 GMT+0000 (UTC)

published: Wed May 25 2022 09:29:27 GMT+0000 (UTC)

arXiv
