Robust Training Using Natural Transformation

Shuo Wang; Lingjuan Lyu; Surya Nepal; Carsten Rudolph; Marthie Grobler; Kristen Moore

自然変換を使用した堅牢なトレーニング

データ変換や敵対的トレーニングによるデータ拡張手法などの深層学習モデルに対する以前の堅牢性アプローチでは、照明条件の変化など、入力のセマンティクスを保持する実際のバリエーションをキャプチャできません。このギャップを埋めるために、画像分類アルゴリズムの堅牢性を向上させるように設計された敵対的なトレーニングスキームであるNaTraを紹介します。クラスIDに依存しない入力画像の属性をターゲットにし、それらの属性を操作して、入力の実際の自然変換（NaTra）を模倣します。これは、画像分類器のトレーニングデータセットを拡張するために使用されます。具体的には、Batch Inverse Encoding and Shiftingを適用して、指定された画像のバッチを、十分にトレーニングされた生成モデルの対応する解きほぐされた潜在コードにマッピングします。潜在コード拡張は、拡張された特徴マップを組み込むことにより、画像再構成の品質を高めるために使用されます。教師なし属性の指示と操作により、特定の属性の変更に対応する潜在的な方向を識別し、それらの属性の解釈可能な操作を生成して、入力データへの自然変換を生成できます。十分に訓練されたGANから導出された解きほぐされた潜在表現を利用して、現実世界の自然変動（照明条件や髪型など）に類似した画像の変換を模倣し、モデルを不変になるように訓練することによって、スキームの有効性を示します。これらの自然な変化。広範な実験は、私たちの方法が分類モデルの一般化を改善し、さまざまな実世界の歪みに対するロバスト性を高めることを示しています

Previous robustness approaches for deep learning models such as data augmentation techniques via data transformation or adversarial training cannot capture real-world variations that preserve the semantics of the input, such as a change in lighting conditions. To bridge this gap, we present NaTra, an adversarial training scheme that is designed to improve the robustness of image classification algorithms. We target attributes of the input images that are independent of the class identification, and manipulate those attributes to mimic real-world natural transformations (NaTra) of the inputs, which are then used to augment the training dataset of the image classifier. Specifically, we apply Batch Inverse Encoding and Shifting to map a batch of given images to corresponding disentangled latent codes of well-trained generative models. Latent Codes Expansion is used to boost image reconstruction quality through the incorporation of extended feature maps. Unsupervised Attribute Directing and Manipulation enables identification of the latent directions that correspond to specific attribute changes, and then produce interpretable manipulations of those attributes, thereby generating natural transformations to the input data. We demonstrate the efficacy of our scheme by utilizing the disentangled latent representations derived from well-trained GANs to mimic transformations of an image that are similar to real-world natural variations (such as lighting conditions or hairstyle), and train models to be invariant to these natural transformations. Extensive experiments show that our method improves generalization of classification models and increases its robustness to various real-world distortions

updated: Mon May 10 2021 01:56:03 GMT+0000 (UTC)

published: Mon May 10 2021 01:56:03 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト