Learning Less Generalizable Patterns with an Asymmetrically Trained Double Classifier for Better Test-Time Adaptation

Thomas Duboudin; Emmanuel Dellandréa; Corentin Abgrall; Gilles Hénaff; Liming Chen

Deep neural networks often fail to generalize outside of their training distribution, in particular when only a single data domain is available during training. While test-time adaptation has yielded encouraging results in this setting, we argue that, to reach further improvements, these approaches should be combined with training procedure modifications that aim to learn a more diverse set of patterns. Indeed, test-time adaptation methods usually have to rely on a limited representation because of the shortcut learning phenomenon: only a subset of the available predictive patterns is learned with standard training. In this paper, we first show that combining existing training-time strategies with test-time batch normalization, a simple adaptation method, does not always improve upon test-time adaptation alone on the PACS benchmark. Furthermore, experiments on Office-Home show that very few training-time methods improve upon standard training, with or without test-time batch normalization. We therefore propose a novel approach that uses a pair of classifiers and a shortcut pattern avoidance loss. This loss mitigates shortcut learning by reducing the generalization ability of the secondary classifier, which encourages the learning of sample-specific patterns. The primary classifier is trained normally, and thus learns both the natural features and the more complex, less generalizable ones. Our experiments show that our method improves upon the state-of-the-art results on both benchmarks and benefits the most from test-time batch normalization.

updated: Mon Oct 17 2022 08:05:38 GMT+0000 (UTC)

published: Mon Oct 17 2022 08:05:38 GMT+0000 (UTC)

arXiv
