Confidence-Guided Data Augmentation for Improved Semi-Supervised Training

Fadoua Khmaissia; Hichem Frigui

We propose a new strategy to improve the accuracy and robustness of image classification. First, we train a baseline CNN model. We then locate challenging regions of the feature space by collecting all misclassified samples together with correctly classified samples that have low confidence values. These samples are used to train a Variational AutoEncoder (VAE), which in turn generates synthetic images. Finally, the synthetic images are combined with the original labeled images to train a new model in a semi-supervised fashion. Empirical results on benchmark datasets such as STL10 and CIFAR-100 show that the synthetically generated samples can further diversify the training data, improving image classification relative to fully supervised baselines that use only the available data.
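The sample-selection step described above can be illustrated with a minimal sketch. The abstract does not say how confidence is measured or which cutoff is used, so this example assumes that confidence is the highest predicted class probability and that the cutoff is 0.6. The function name `select_hard_samples` and the toy probabilities are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def select_hard_samples(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    conf_threshold: float = 0.6,  # assumed cutoff, not taken from the paper
) -> List[int]:
    """Return indices of samples that are misclassified, or correctly
    classified with confidence below conf_threshold."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    hard = []
    for i, (p, y) in enumerate(zip(probs, labels)):
        # Predicted class = index of the largest probability.
        pred = max(range(len(p)), key=p.__getitem__)
        # Assumption: confidence is the probability of the predicted class.
        confidence = p[pred]
        if pred != y or confidence < conf_threshold:
            hard.append(i)
    return hard


# Toy baseline outputs for four samples over three classes.
probs = [
    [0.90, 0.05, 0.05],  # correct and confident
    [0.40, 0.35, 0.25],  # correct but low confidence
    [0.10, 0.80, 0.10],  # predicted class 1, true class 2
    [0.20, 0.10, 0.70],  # correct and confident
]
labels = [0, 0, 2, 2]

hard_indices = select_hard_samples(probs, labels, conf_threshold=0.6)
print(hard_indices)  # [1, 2]
```

In the pipeline the abstract outlines, the images at the returned indices would form the training set for the VAE. The right threshold would likely depend on the dataset and on how well calibrated the baseline model is.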

updated: Wed Feb 22 2023 00:09:52 GMT+0000 (UTC)

published: Fri Sep 16 2022 21:23:19 GMT+0000 (UTC)

arXiv
