CAFA: Class-Aware Feature Alignment for Test-Time Adaptation

Sanghun Jung; Jungsoo Lee; Nanhee Kim; Jaegul Choo

Despite recent advances in deep learning, deep networks still suffer performance degradation when they encounter new data that differ from their training distribution. To address this problem, test-time adaptation (TTA) aims to adapt a model to unlabeled test data at test time while simultaneously making predictions. TTA applies to pretrained networks without modifying their training procedures, which makes it possible to use the already well-formed source distribution for adaptation. One possible approach is to align the representation space of test samples with the source distribution (i.e., feature alignment). However, performing feature alignment in TTA is especially challenging because access to labeled source data is restricted during adaptation. That is, a model has no chance to learn test data in a class-discriminative manner, which was feasible in other adaptation tasks (e.g., unsupervised domain adaptation) via a supervised loss on the source data. Based on this observation, this paper proposes a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which simultaneously 1) encourages a model to learn target representations in a class-discriminative manner and 2) effectively mitigates distribution shifts at test time. Our method does not require any hyper-parameters or additional losses, which previous approaches require. We conduct extensive experiments and show that our proposed method consistently outperforms existing baselines.
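The abstract does not spell out the form of the loss, so the following is only a rough sketch of what a class-aware alignment objective could look like, not the authors' implementation. It assumes that per-class means and per-dimension variances of source features were stored before adaptation, that a pseudo label comes from the model's own prediction, and that a diagonal variance-normalized distance is an acceptable simplification. The function name `class_aware_alignment_loss` and all parameter names are hypothetical.

```python
import math

def class_aware_alignment_loss(feature, class_means, class_vars, pseudo_label):
    """Cross-entropy over negative distances to per-class source statistics.

    Hypothetical sketch: `class_means` and `class_vars` are assumed to be
    precomputed from source features; `pseudo_label` is assumed to be the
    model's predicted class for this test sample.
    """
    # Squared, variance-normalized distance from the feature to each class.
    dists = [
        sum((f - m) ** 2 / v for f, m, v in zip(feature, mean, var))
        for mean, var in zip(class_means, class_vars)
    ]
    # Treat negative distances as logits; normalize with a stable log-sum-exp.
    logits = [-d for d in dists]
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
    # The loss is small when the feature is close to its pseudo-labeled class
    # and far from every other class.
    return log_norm - logits[pseudo_label]

# Toy source statistics for two classes in a 2-D feature space.
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
near = class_aware_alignment_loss([0.1, -0.1], means, variances, pseudo_label=0)
ambiguous = class_aware_alignment_loss([2.0, 2.0], means, variances, pseudo_label=0)
print(near < ambiguous)
```

Under these assumptions, a single term both pulls a test feature toward the source statistics of its assigned class and pushes it away from the other classes, which is one way to read the two goals listed in the abstract.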

updated: Wed Jun 01 2022 03:02:07 GMT+0000 (UTC)

published: Wed Jun 01 2022 03:02:07 GMT+0000 (UTC)

arXiv
