Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification

Oren Nuriel; Sagie Benaim; Lior Wolf

Recent work has shown that convolutional neural network classifiers rely too heavily on texture at the expense of shape cues. We draw a related but different distinction: shape and local image cues on the one hand, and global image statistics on the other. Our method, Permuted Adaptive Instance Normalization (pAdaIN), reduces the representation of global statistics in the hidden layers of image classifiers. pAdaIN samples a random permutation π that rearranges the samples in a given batch. Adaptive Instance Normalization (AdaIN) is then applied between the activations of each (non-permuted) sample i and the corresponding activations of sample π(i), thereby swapping statistics between the samples of the batch. Because this swap distorts the global image statistics, the network is pushed to rely on other cues, such as shape or texture. Choosing the random permutation with probability p and the identity permutation otherwise controls the strength of the effect. With a suitable choice of p, fixed a priori for all experiments and selected without considering the test data, our method consistently outperforms baselines in multiple settings. In image classification, our method improves results on both CIFAR100 and ImageNet across multiple architectures. In the robustness setting, it improves results on both ImageNet-C and CIFAR100-C for multiple architectures. In domain adaptation and domain generalization, it achieves state-of-the-art results on the transfer learning task from GTAV to Cityscapes and on the PACS benchmark.
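
The swapping step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes activations of shape (batch, channels, height, width), takes the "statistics" to be the per-sample, per-channel spatial mean and standard deviation (the usual AdaIN definition), and uses the hypothetical names `adain`, `permuted_adain`, and `eps`, none of which come from the abstract. The abstract also does not say where in the network the operation is placed, so the sketch treats it as a standalone function.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Give each (sample, channel) map in `content` the spatial mean and
    standard deviation of the matching map in `style`.

    Both arrays are assumed to have shape (batch, channels, height, width).
    `eps` is an illustrative stabilizer that avoids division by zero.
    """
    c_mean = content.mean(axis=(2, 3), keepdims=True)
    c_std = content.std(axis=(2, 3), keepdims=True) + eps
    s_mean = style.mean(axis=(2, 3), keepdims=True)
    s_std = style.std(axis=(2, 3), keepdims=True) + eps
    # Normalize the content map, then rescale and shift with style statistics.
    return s_std * (content - c_mean) / c_std + s_mean


def permuted_adain(x, p, rng, eps=1e-5):
    """With probability p, apply AdaIN between sample i and sample pi(i) for a
    random batch permutation pi; otherwise use the identity permutation,
    which leaves x unchanged."""
    if rng.random() >= p:
        return x  # identity permutation: no statistics are swapped
    perm = rng.permutation(x.shape[0])
    return adain(x, x[perm], eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    activations = rng.normal(size=(4, 3, 8, 8))
    mixed = permuted_adain(activations, p=1.0, rng=rng)
    print(mixed.shape)
```

With `p=0.0` the function always returns its input, and with `p=1.0` it always swaps statistics, so `p` plays the role of the strength control described above. Each output map keeps the spatial layout of sample i but carries the channel-wise mean and standard deviation of sample π(i).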

updated: Tue Dec 08 2020 15:21:29 GMT+0000 (UTC)

published: Fri Oct 09 2020 16:38:38 GMT+0000 (UTC)

arXiv
