CAFA: Class-Aware Feature Alignment for Test-Time Adaptation

Sanghun Jung; Jungsoo Lee; Nanhee Kim; Amirreza Shaban; Byron Boots; Jaegul Choo

Despite recent advances in deep learning, deep neural networks still suffer performance degradation when applied to new data that differs from the training data. Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time. TTA can be applied to pretrained networks without modifying their training procedures, enabling them to utilize a well-formed source distribution for adaptation. One possible approach is to align the representation space of test samples with the source distribution (i.e., feature alignment). However, performing feature alignment in TTA is especially challenging because access to labeled source data is restricted during adaptation. That is, a model has no opportunity to learn test data in a class-discriminative manner, which was feasible in other adaptation tasks (e.g., unsupervised domain adaptation) via supervised losses on the source data. Based on this observation, we propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which simultaneously 1) encourages a model to learn target representations in a class-discriminative manner and 2) effectively mitigates distribution shifts at test time. Our method does not require the hyper-parameters or additional losses that previous approaches rely on. We conduct extensive experiments on 6 different datasets and show that our proposed method consistently outperforms existing baselines.
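To make the idea of a class-discriminative alignment loss concrete, the following is a minimal sketch and not the authors' implementation; the abstract does not give the exact formulation. It assumes that per-class means and diagonal variances of source features were computed before deployment, that pseudo-labels come from the model's own predictions on test samples, and that the loss is a softmax cross-entropy over negative variance-normalized distances. The function name `class_aware_alignment_loss`, its parameters, and the diagonal-variance simplification are all hypothetical choices made for illustration.

```python
import math


def class_aware_alignment_loss(features, pseudo_labels, class_means, class_vars, eps=1e-6):
    """Illustrative class-aware alignment loss (assumed form, not from the paper).

    features:      list of test feature vectors
    pseudo_labels: predicted class index for each feature (assumption)
    class_means:   per-class source feature means (assumed precomputed)
    class_vars:    per-class diagonal source variances (assumed precomputed)
    """
    total = 0.0
    for feat, label in zip(features, pseudo_labels):
        # Variance-normalized squared distance from the feature to each class mean.
        dists = [
            sum((x - m) ** 2 / (v + eps) for x, m, v in zip(feat, mean, var))
            for mean, var in zip(class_means, class_vars)
        ]
        # Closer classes get larger logits.
        logits = [-d for d in dists]
        # Numerically stable log-sum-exp over classes.
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
        # Cross-entropy against the pseudo-label: pulls the feature toward its
        # own class statistics and pushes it away from the other classes.
        total += log_norm - logits[label]
    return total / len(features)


# Toy source statistics for two classes in a 2-D feature space (made-up values).
class_means = [[0.0, 0.0], [4.0, 4.0]]
class_vars = [[1.0, 1.0], [1.0, 1.0]]

# A feature close to class 0 versus one equidistant from both classes.
near = class_aware_alignment_loss([[0.1, -0.1]], [0], class_means, class_vars)
far = class_aware_alignment_loss([[2.0, 2.0]], [0], class_means, class_vars)
```

In this toy setup, a feature near its pseudo-labeled class mean yields a loss close to zero, while a feature equidistant from both class means yields log 2, so minimizing the loss moves features toward one class and away from the others. Because the sketch uses only fixed source statistics and the model's own predictions, it has no loss-weighting hyper-parameter to tune.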

updated: Mon Sep 04 2023 02:55:32 GMT+0000 (UTC)

published: Wed Jun 01 2022 03:02:07 GMT+0000 (UTC)

arXiv
