A Clustering-guided Contrastive Fusion for Multi-view Representation Learning

Guanzhou Ke; Guoqing Chao; Xiaoli Wang; Chenyang Xu; Yongqi Zhu; Yang Yu

The past two decades have seen rapid advances in multi-view representation learning, which extracts useful information from diverse domains to facilitate the development of multi-view applications. However, the community faces two challenges: i) how to learn representations from large amounts of unlabeled data that are robust to noise and incomplete-view settings, and ii) how to balance view consistency and complementarity for various downstream tasks. To this end, we use a deep fusion network to fuse view-specific representations into a view-common representation, extracting high-level semantics to obtain a robust representation. In addition, we employ a clustering task to guide the fusion network and prevent it from reaching trivial solutions. To balance consistency and complementarity, we design an asymmetrical contrastive strategy that aligns the view-common representation with each view-specific representation. These modules are incorporated into a unified method known as CLustering-guided cOntrastiVE fusioN (CLOVEN). We quantitatively and qualitatively evaluate the proposed method on five datasets, demonstrating that CLOVEN outperforms 11 competitive multi-view learning methods in clustering and classification. In the incomplete-view scenario, the proposed method resists noise interference better than competing methods. Furthermore, visualization analysis shows that CLOVEN preserves the intrinsic structure of the view-specific representations while also improving the compactness of the view-common representation. Our source code will be available soon at https://github.com/guanzhou-ke/cloven.
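
The alignment between the view-common representation and each view-specific representation can be illustrated with a small sketch. The abstract does not state the loss function, so the InfoNCE-style objective, the cosine similarity, the temperature value, and the names `cosine`, `info_nce`, and `common_to_views_loss` are all assumptions made for illustration; this is not the authors' implementation. "Asymmetrical" is read here as contrasting the common representation against each view, with no direct view-to-view term.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, targets, temperature=0.5):
    """InfoNCE-style loss (assumed): anchors[i] should match targets[i]
    and be dissimilar to targets[j] for j != i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, t) / temperature for t in targets]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def common_to_views_loss(common, views, temperature=0.5):
    """Average the loss between the view-common batch and each view-specific batch.

    Only common-vs-view pairs are contrasted; view-vs-view pairs are never
    compared directly (the assumed reading of "asymmetrical").
    """
    return sum(info_nce(common, view, temperature) for view in views) / len(views)


# Toy batch of two samples seen through two hypothetical views.
views = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.2, 0.8]],
]
aligned = [[1.0, 0.05], [0.1, 1.0]]   # common representation matching its samples
shuffled = [aligned[1], aligned[0]]   # same vectors assigned to the wrong samples

print(common_to_views_loss(aligned, views))
print(common_to_views_loss(shuffled, views))
```

In this toy batch the correctly matched common representation receives a lower loss than the shuffled one, which is the behavior an alignment objective of this kind is meant to reward.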

updated: Thu Aug 03 2023 08:31:53 GMT+0000 (UTC)

published: Wed Dec 28 2022 07:21:05 GMT+0000 (UTC)

arXiv
