Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion

Sijie Mai; Haifeng Hu; Songlong Xing

Learning a joint embedding space for various modalities is of vital importance for multimodal fusion. Mainstream modality fusion approaches fail to achieve this goal, leaving a modality gap that heavily affects cross-modal fusion. In this paper, we propose a novel adversarial encoder-decoder-classifier framework to learn a modality-invariant embedding space. Since the distributions of various modalities differ in nature, we reduce the modality gap by translating the distributions of the source modalities into that of the target modality via their respective encoders using adversarial training. Furthermore, we impose additional constraints on the embedding space by introducing a reconstruction loss and a classification loss. We then fuse the encoded representations using a hierarchical graph neural network that explicitly explores unimodal, bimodal and trimodal interactions in multiple stages. Our method achieves state-of-the-art performance on multiple datasets. Visualization of the learned embeddings suggests that the joint embedding space learned by our method is discriminative. Code is available at: https://github.com/TmacMai/ARGF_multimodal_fusion
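The abstract names three constraints on the embedding space: an adversarial term, a reconstruction loss and a classification loss. The sketch below illustrates one common way such terms can be combined for a single source-modality sample. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the binary cross-entropy form of the adversarial term, and the weights `w_adv`, `w_rec` and `w_cls` are all hypothetical and do not come from the paper or its repository.

```python
import math

EPS = 1e-12  # numerical floor to keep log() finite


def bce(prob, label):
    """Binary cross-entropy for one predicted probability in [0, 1]."""
    p = min(max(prob, EPS), 1.0 - EPS)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def mse(x, y):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def cross_entropy(class_probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(max(class_probs[label], EPS))


def discriminator_loss(d_target, d_source):
    """Assumed discriminator objective: score target-modality
    embeddings as 1 and source-modality embeddings as 0."""
    return bce(d_target, 1.0) + bce(d_source, 0.0)


def encoder_adversarial_loss(d_source):
    """Assumed encoder objective: make source embeddings look like
    target embeddings, i.e. push the discriminator output toward 1."""
    return bce(d_source, 1.0)


def encoder_objective(d_source, x, x_rec, class_probs, label,
                      w_adv=1.0, w_rec=1.0, w_cls=1.0):
    """Weighted sum of the three terms named in the abstract.
    The weights are hypothetical hyperparameters."""
    return (w_adv * encoder_adversarial_loss(d_source)
            + w_rec * mse(x, x_rec)
            + w_cls * cross_entropy(class_probs, label))


# Toy usage: one source-modality sample.
loss = encoder_objective(
    d_source=0.3,              # discriminator score for the source embedding
    x=[1.0, 2.0],              # original input features
    x_rec=[1.0, 4.0],          # decoder reconstruction
    class_probs=[0.25, 0.75],  # classifier output
    label=1,                   # true class index
)
```

In this formulation the discriminator and the source encoder are updated in alternation: the discriminator minimizes `discriminator_loss`, while the encoder minimizes `encoder_objective`, so the adversarial term pulls the source distribution toward the target distribution and the other two terms keep the embedding informative.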

updated: Thu Dec 10 2020 01:52:20 GMT+0000 (UTC)

published: Mon Nov 18 2019 08:29:20 GMT+0000 (UTC)

arXiv
