Learning ABCs: Approximate Bijective Correspondence for isolating factors of variation with weak supervision

Kieran A. Murphy; Varun Jampani; Srikumar Ramalingam; Ameesh Makadia

Representation learning forms the backbone of most deep learning applications, and the value of a learned representation is intimately tied to the information it carries about different factors of variation. Finding good representations depends on the nature of the supervision and on the learning algorithm. We propose a novel algorithm that uses a weak form of supervision in which the data is partitioned into sets according to certain inactive (common) factors of variation, which are invariant across the elements of each set. Our key insight is that by seeking correspondence between elements of different sets, we learn strong representations that exclude the inactive factors of variation and isolate the active factors that vary within all sets. Because it focuses on the active factors, our method can leverage a mix of set-supervised and wholly unsupervised data, which can even belong to a different domain. We tackle the challenging problem of synthetic-to-real object pose transfer without any pose annotations, by isolating pose information that generalizes to the category level and across the synthetic/real domain gap. The method can also boost performance in supervised settings by strengthening intermediate representations, and it operates in practically attainable scenarios with set-supervised natural images, where quantity is limited and nuisance factors of variation are more plentiful.
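To make the idea of "seeking correspondence between elements of different sets" concrete, here is a minimal sketch of one way such a correspondence could be scored. The abstract does not specify the loss, so this is an assumption: a soft nearest-neighbor cycle-consistency score computed with NumPy. The function names (`softmax`, `cycle_consistency_loss`) and the `temperature` parameter are hypothetical and are not taken from the paper. Each embedding in set A is softly matched to set B, and the loss is low when the match leads back to the element it started from.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cycle_consistency_loss(set_a, set_b, temperature=0.1):
    """Illustrative correspondence score between two sets of embeddings.

    set_a: array of shape (n, d), embeddings of one set.
    set_b: array of shape (m, d), embeddings of another set.
    Returns the mean cross-entropy of each element of A cycling
    A -> B -> A back to itself (hypothetical loss, not the paper's).
    """
    # Pairwise squared Euclidean distances from A to B, shape (n, m).
    d_ab = ((set_a[:, None, :] - set_b[None, :, :]) ** 2).sum(axis=-1)
    # Soft nearest neighbor in B for every element of A, shape (n, d).
    soft_nn = softmax(-d_ab / temperature, axis=1) @ set_b
    # Distances from each soft neighbor back to all elements of A, shape (n, n).
    d_back = ((soft_nn[:, None, :] - set_a[None, :, :]) ** 2).sum(axis=-1)
    p_back = softmax(-d_back / temperature, axis=1)
    # Probability that element i returns to index i after the cycle.
    idx = np.arange(set_a.shape[0])
    return float(-np.log(p_back[idx, idx] + 1e-12).mean())


# Embeddings that encode only a shared active factor (e.g. pose) align well.
aligned_a = np.array([[0.0], [1.0], [2.0], [3.0]])
aligned_b = np.array([[2.0], [0.0], [3.0], [1.0]])  # same values, shuffled
aligned_loss = cycle_consistency_loss(aligned_a, aligned_b)

# Collapsed embeddings carry no information, so every cycle is ambiguous.
collapsed = np.zeros((4, 1))
collapsed_loss = cycle_consistency_loss(collapsed, collapsed)
```

In this toy setup, the two sets share the same values of the active factor, so the aligned embeddings give a loss near zero, while the collapsed embeddings give a loss of log(4), the value for a uniform guess over four elements. An embedding that also encoded a set-specific (inactive) factor would push the two sets apart and make the soft matches less reliable, which is the intuition behind excluding inactive factors.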

updated: Wed Mar 30 2022 15:09:20 GMT+0000 (UTC)

published: Thu Mar 04 2021 18:58:45 GMT+0000 (UTC)

arXiv
