Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains

Shivam Chandhok; Sanath Narayan; Hisham Cholakkal; Rao Muhammad Anwer; Vineeth N Balasubramanian; Fahad Shahbaz Khan; Ling Shao

The scarcity of task-specific annotated data has, in recent years, driven concerted efforts on specific settings such as zero-shot learning (ZSL) and domain generalization (DG), which address semantic shift and domain shift, respectively. However, real-world applications are often not so constrained and require handling unseen classes in unseen domains, a setting called Zero-shot Domain Generalization in which domain shift and semantic shift occur simultaneously. In this work, we propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains, as well as class-specific semantic text-based representations, to a common latent space. In particular, our method jointly pursues the following objectives: (i) aligning the multimodal cues from visual and text-based semantic concepts; (ii) partitioning the common latent space according to the domain-agnostic class-level semantic concepts; and (iii) learning domain invariance with respect to the visual-semantic joint distribution in order to generalize to unseen classes in unseen domains. Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods, with significant gains on difficult domains such as quickdraw and sketch.
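
The idea of projecting images and class-level text representations into a common latent space can be illustrated with a small sketch. This is not the authors' implementation: the dimensions, the random linear projections `W_vis` and `W_sem`, and the cosine-similarity scoring rule are all assumptions made for illustration, and the abstract's three training objectives are not implemented here. The sketch only shows how a shared space lets an image be scored against classes that are described by text alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
VIS_DIM, SEM_DIM, LATENT_DIM = 8, 5, 4

# Stand-ins for learned projections. They are random here; a real model
# would train them (the abstract does not specify their form).
W_vis = rng.normal(size=(VIS_DIM, LATENT_DIM))
W_sem = rng.normal(size=(SEM_DIM, LATENT_DIM))


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def project_images(features):
    """Map image features (n, VIS_DIM) into the common latent space."""
    return l2_normalize(features @ W_vis)


def project_classes(semantics):
    """Map class text embeddings (c, SEM_DIM) into the common latent space."""
    return l2_normalize(semantics @ W_sem)


def predict(image_features, class_semantics):
    """Assign each image to the class whose latent embedding is most similar."""
    z_img = project_images(image_features)    # shape (n, LATENT_DIM)
    z_cls = project_classes(class_semantics)  # shape (c, LATENT_DIM)
    scores = z_img @ z_cls.T                  # cosine similarities, shape (n, c)
    return scores.argmax(axis=1), scores


# Three images from some domain and two classes described only by text.
image_features = rng.normal(size=(3, VIS_DIM))
unseen_class_semantics = rng.normal(size=(2, SEM_DIM))
labels, scores = predict(image_features, unseen_class_semantics)
```

Because classes enter only through their text embeddings, new classes can be added at test time by supplying additional rows to `class_semantics`, without retraining the image projection. Whether the predictions are meaningful depends entirely on how the projections are trained, which this sketch leaves out.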

updated: Mon Jul 12 2021 17:57:46 GMT+0000 (UTC)

published: Mon Jul 12 2021 17:57:46 GMT+0000 (UTC)

arXiv
