TDG: Text-guided Domain Generalization

Geng Liu; Yuxi Wang



Domain generalization (DG) aims to generalize a model trained on one or more source domains to an unseen target domain. Building on the recent success of vision-and-language pre-trained models, we argue that introducing extra text information is crucial for domain generalization. In this paper, we develop a novel Text-guided Domain Generalization (TDG) paradigm, which comprises three aspects. First, we devise an automatic word generation method that extends the description of the current domains with novel domain-relevant words. Second, we embed the generated domain information into the text feature space using the proposed prompt learning-based text feature generation method; this space is shared with the image features. Finally, we use both the input image features and the generated text features to train a specially designed classifier that generalizes well to unseen target domains, while the image encoder is also updated by the gradients backpropagated from the classifier. Our experiments show that the techniques incorporated in TDG contribute to performance while remaining easy to implement. Results on several domain generalization benchmarks show that the proposed framework achieves superior performance by effectively leveraging the generated text information.
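The three aspects described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the encoders are replaced by hand-made vectors in a small shared space, the "generated" domain words are a hard-coded hypothetical list, the classifier is a plain linear softmax model, and the image encoder is kept fixed rather than updated. All names (text_feature, image_feature, GENERATED_DOMAINS, and so on) are assumptions introduced only for illustration.

```python
import math
import random

rng = random.Random(0)
DIM = 8
CLASSES = ["dog", "horse", "house"]
SOURCE_DOMAINS = ["photo"]
GENERATED_DOMAINS = ["sketch", "cartoon"]  # hypothetical output of a word generation step
TARGET_DOMAIN = "painting"                 # never used during training


def normalize(v):
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]


def random_direction():
    return normalize([rng.gauss(0, 1) for _ in range(DIM)])


# Toy stand-in for a shared vision-language embedding space (assumption).
CLASS_VEC = {c: random_direction() for c in CLASSES}
DOMAIN_VEC = {d: random_direction()
              for d in SOURCE_DOMAINS + GENERATED_DOMAINS + [TARGET_DOMAIN]}


def text_feature(domain, cls):
    """Stand-in for encoding a prompt such as 'a {domain} of a {cls}'."""
    return normalize([c + 0.5 * d
                      for c, d in zip(CLASS_VEC[cls], DOMAIN_VEC[domain])])


def image_feature(domain, cls):
    """Stand-in for an image encoder: the same mixture plus Gaussian noise."""
    return normalize([c + 0.5 * d + rng.gauss(0, 0.1)
                      for c, d in zip(CLASS_VEC[cls], DOMAIN_VEC[domain])])


def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]


def train_classifier(samples, epochs=200, lr=0.5):
    """Linear softmax classifier trained with plain SGD on (feature, label) pairs."""
    w = [[0.0] * DIM for _ in CLASSES]
    for _ in range(epochs):
        for x, y in samples:
            p = softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in w])
            for k, row in enumerate(w):
                g = p[k] - (1.0 if k == y else 0.0)  # cross-entropy gradient
                for j in range(DIM):
                    row[j] -= lr * g * x[j]
    return w


def predict(w, x):
    scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    return scores.index(max(scores))


def accuracy(w, samples):
    return sum(predict(w, x) == y for x, y in samples) / len(samples)


# Training set: source-domain image features plus text features
# for the generated domain words; the target domain is held out.
train = [(image_feature(d, c), k)
         for d in SOURCE_DOMAINS for k, c in enumerate(CLASSES) for _ in range(20)]
train += [(text_feature(d, c), k)
          for d in GENERATED_DOMAINS for k, c in enumerate(CLASSES)]
test = [(image_feature(TARGET_DOMAIN, c), k)
        for k, c in enumerate(CLASSES) for _ in range(20)]

weights = train_classifier(train)
target_accuracy = accuracy(weights, test)
print(f"accuracy on unseen domain: {target_accuracy:.2f}")
```

The point of the sketch is only the data flow: text features for domains that have no images are added to the classifier's training set alongside source-domain image features, because both live in the same representation space. A real system would obtain these features from a pre-trained vision-and-language model rather than from random vectors.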

updated: Sat Aug 19 2023 07:21:02 GMT+0000 (UTC)

published: Sat Aug 19 2023 07:21:02 GMT+0000 (UTC)

arXiv

