A Contrastive Objective for Learning Disentangled Representations

Jonathan Kahana; Yedid Hoshen

Learning image representations that are invariant to sensitive or unwanted attributes is important for many tasks, including bias removal and cross-domain retrieval. Here, our objective is to learn representations that are invariant to the domain (the sensitive attribute), for which labels are provided, while remaining informative about all other image attributes, which are unlabeled. We propose a new domain-wise contrastive objective for ensuring invariant representations. This objective crucially restricts negative image pairs to be drawn from the same domain, which enforces domain invariance, whereas the standard contrastive objective does not. The domain-wise objective is insufficient on its own, as it suffers from shortcut solutions that result in feature suppression. We overcome this issue with a combination of a reconstruction constraint, image augmentations, and initialization with pre-trained weights. Our analysis shows that the choice of augmentations is important, and that a misguided choice can harm both the invariance and informativeness objectives. In an extensive evaluation, our method convincingly outperforms the state of the art in terms of representation invariance, representation informativeness, and training speed. Furthermore, we find that in some cases our method achieves excellent results even without the reconstruction constraint, leading to much faster and more resource-efficient training.
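To make the central idea concrete, the following is a minimal sketch of a contrastive loss whose negatives are restricted to the anchor's own domain. The abstract does not give the exact formulation, so the InfoNCE-style form, the cosine similarity, the temperature value, and the function names (`same_domain_negatives`, `domain_wise_contrastive_loss`) are all assumptions made for illustration, not the authors' implementation. The reconstruction constraint and pre-trained initialization are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def same_domain_negatives(i, domains):
    """Indices of all other samples that share the domain label of sample i."""
    return [j for j in range(len(domains)) if j != i and domains[j] == domains[i]]


def domain_wise_contrastive_loss(z, z_aug, domains, temperature=0.5):
    """InfoNCE-style loss with same-domain negatives only (illustrative assumption).

    z       -- embeddings of the original images
    z_aug   -- embeddings of augmented views; z_aug[i] is the positive for z[i]
    domains -- domain label (the sensitive attribute) of each image
    """
    losses = []
    for i, anchor in enumerate(z):
        pos = math.exp(cosine(anchor, z_aug[i]) / temperature)
        # Negatives come only from the anchor's own domain, so encoding the
        # domain label cannot help tell the positive apart from the negatives.
        neg = sum(
            math.exp(cosine(anchor, z[j]) / temperature)
            for j in same_domain_negatives(i, domains)
        )
        losses.append(-math.log(pos / (pos + neg)))
    return sum(losses) / len(losses)
```

In a standard contrastive loss the negatives would be every other sample in the batch, so an encoder could lower the loss simply by separating images by domain. Dropping the `domains[j] == domains[i]` filter in `same_domain_negatives` recovers that standard behaviour for comparison.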

updated: Mon Mar 21 2022 18:56:36 GMT+0000 (UTC)

published: Mon Mar 21 2022 18:56:36 GMT+0000 (UTC)

arXiv
