Adaptive Methods for Aggregated Domain Generalization

Xavier Thomas; Dhruv Mahajan; Alex Pentland; Abhimanyu Dubey


Domain generalization involves learning a classifier from a heterogeneous collection of training sources such that it generalizes to data drawn from similar unknown target domains, with applications in large-scale learning and personalized inference. In many settings, privacy concerns prohibit obtaining domain labels for the training data samples, so that only an aggregated collection of training points is available. Existing approaches that use domain labels to create domain-invariant feature representations are inapplicable in this setting, and alternative approaches are needed to learn generalizable classifiers. In this paper, we propose a domain-adaptive approach to this problem, which operates in two steps: (a) we cluster the training data within a carefully chosen feature space to create pseudo-domains, and (b) using these pseudo-domains, we learn a domain-adaptive classifier that makes predictions using information about both the input and the pseudo-domain it belongs to. Our approach achieves state-of-the-art performance on a variety of domain generalization benchmarks without using any domain labels. Furthermore, we provide novel theoretical guarantees on domain generalization using cluster information. Our approach is amenable to ensemble-based methods and provides substantial gains even on large-scale benchmark datasets. The code can be found at: https://github.com/xavierohan/AdaClust_DomainBed
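The data flow of the two steps can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it assumes plain k-means as the clustering method, toy 2-D feature vectors, and concatenation of each input with its cluster centroid as the way pseudo-domain information reaches the classifier. All function names (`kmeans`, `nearest`, `augment`) are hypothetical, and classifier training itself is left out.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the centroid closest to p (ties go to the lower index)."""
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))

def kmeans(points, k, iters=20, seed=0):
    """Step (a), assumed here to be plain k-means: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment against the final centroids: these are the pseudo-domains.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

def augment(p, centroids):
    """Input to step (b): the feature vector plus its pseudo-domain centroid."""
    return list(p) + list(centroids[nearest(p, centroids)])

# Toy aggregated training set: no domain labels, two latent groups (assumed data).
features = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]

centroids, labels = kmeans(features, k=2)
augmented = [augment(p, centroids) for p in features]

# An unseen point is assigned to its nearest pseudo-domain in the same way.
new_point = augment((9.5, 10.5), centroids)
```

Any standard classifier could then be trained on `augmented` instead of `features`, so that its predictions depend on both the input and the pseudo-domain the input falls into. In practice the feature space and the number of clusters would need to be chosen with care, as the abstract notes.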

updated: Thu Dec 09 2021 08:57:01 GMT+0000 (UTC)

published: Thu Dec 09 2021 08:57:01 GMT+0000 (UTC)

arXiv
