CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

Tongkun Xu; Weihua Chen; Pichao Wang; Fan Wang; Hao Li; Rong Jin

Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a labeled source domain to a different, unlabeled target domain. Most existing UDA methods focus on learning domain-invariant feature representations, at either the domain level or the category level, using frameworks based on convolutional neural networks (CNNs). One fundamental problem for category-level UDA is the production of pseudo labels for samples in the target domain: these labels are usually too noisy for accurate domain alignment, which inevitably compromises UDA performance. Motivated by the success of the Transformer in various tasks, we find that its cross-attention is robust to noisy input pairs and yields better feature alignment; in this paper we therefore adopt the Transformer for the challenging UDA task. Specifically, to generate accurate input pairs, we design a two-way center-aware labeling algorithm that produces pseudo labels for target samples. Along with the pseudo labels, we propose a weight-sharing triple-branch Transformer framework that applies self-attention for source and target feature learning and cross-attention for source-target domain alignment. This design explicitly forces the framework to learn discriminative domain-specific and domain-invariant representations simultaneously. We call the proposed method CDTrans (cross-domain Transformer); it is one of the first attempts to solve UDA tasks with a pure Transformer solution. Extensive experiments show that the proposed method achieves the best performance on the Office-Home, VisDA-2017, and DomainNet datasets.
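
To make the weight-sharing triple-branch idea more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a single attention head with no layer normalization, MLP, or training loop, uses random weights and random token features, and assumes the cross-attention branch takes queries from the source sample and keys/values from its paired target sample (the abstract does not specify the direction). All function names (`attention`, `triple_branch`, `random_matrix`) are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Scaled dot-product attention.

    Queries come from query_tokens; keys and values come from context_tokens.
    Passing the same sequence twice gives self-attention, passing two
    different sequences gives cross-attention.
    """
    q = matmul(query_tokens, w_q)
    k = matmul(context_tokens, w_k)
    v = matmul(context_tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def triple_branch(source_tokens, target_tokens, w_q, w_k, w_v):
    """Run three branches that all share the same projection weights."""
    # Source branch: self-attention over source tokens.
    source_out = attention(source_tokens, source_tokens, w_q, w_k, w_v)
    # Target branch: self-attention over target tokens.
    target_out = attention(target_tokens, target_tokens, w_q, w_k, w_v)
    # Source-target branch: cross-attention (assumed direction: source queries).
    mixed_out = attention(source_tokens, target_tokens, w_q, w_k, w_v)
    return source_out, mixed_out, target_out


def random_matrix(rows, cols, rng):
    """Matrix of standard-normal entries, used here as stand-in data."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 8
w_q = random_matrix(dim, dim, rng)
w_k = random_matrix(dim, dim, rng)
w_v = random_matrix(dim, dim, rng)
source_tokens = random_matrix(5, dim, rng)  # 5 source patch tokens
target_tokens = random_matrix(7, dim, rng)  # 7 target patch tokens

source_out, mixed_out, target_out = triple_branch(
    source_tokens, target_tokens, w_q, w_k, w_v
)
print(len(source_out), len(mixed_out), len(target_out))
```

Because the three branches reuse the same `w_q`, `w_k`, and `w_v`, the only difference between them is which sequence supplies the queries and which supplies the keys and values. In this sketch the cross-attention output has one row per source token, each row being a weighted mix of target values.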

updated: Mon Sep 13 2021 17:59:07 GMT+0000 (UTC)

published: Mon Sep 13 2021 17:59:07 GMT+0000 (UTC)

arXiv
