Removing Undesirable Feature Contributions Using Out-of-Distribution Data

Saehyung Lee; Changhwa Park; Hyungyu Lee; Jihun Yi; Jonghyun Lee; Sungroh Yoon

Several data augmentation methods use unlabeled-in-distribution (UID) data to bridge the gap between the training and inference of neural networks. However, these methods have clear limitations: UID data are not always available, and the algorithms depend on pseudo-labels. Herein, we propose a data augmentation method that improves generalization in both adversarial and standard learning by using out-of-distribution (OOD) data, which are free of these issues. We show theoretically how OOD data can improve generalization in each learning scenario and complement our theoretical analysis with experiments on CIFAR-10, CIFAR-100, and a subset of ImageNet. The results indicate that undesirable features are shared even among image data that seem to have little correlation from a human point of view. We also present the advantages of the proposed method through comparison with other data augmentation methods that can be used in the absence of UID data. Furthermore, we demonstrate that the proposed method can further improve existing state-of-the-art adversarial training methods.

updated: Wed Mar 03 2021 05:40:51 GMT+0000 (UTC)

published: Sun Jan 17 2021 10:26:34 GMT+0000 (UTC)

arXiv
