Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation

Duo Peng; Yinjie Lei; Wen Li; Pingping Zhang; Yulan Guo

Domain adaptation is critical when annotations are lacking in a new domain. Because labeling 3D point clouds is extremely time-consuming, domain adaptation for 3D semantic segmentation is highly desirable. With the rise of multi-modal datasets, large numbers of 2D images are available alongside 3D point clouds. In light of this, we propose to further leverage 2D data for 3D domain adaptation through intra-domain and inter-domain cross-modal learning. For intra-domain cross-modal learning, most existing works sample the dense 2D pixel-wise features down to the same size as the sparse 3D point-wise features, thereby discarding numerous useful 2D features. To address this problem, we propose Dynamic sparse-to-dense Cross Modal Learning (DsCML) to increase the sufficiency of multi-modality information interaction for domain adaptation. For inter-domain cross-modal learning, we further advance Cross Modal Adversarial Learning (CMAL) on 2D and 3D data that contain different semantic content, aiming to promote high-level modal complementarity. We evaluate our model under various multi-modality domain adaptation settings, including day-to-night, country-to-country, and dataset-to-dataset, and it brings large improvements over both uni-modal and multi-modal domain adaptation methods in all settings.
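
As a rough illustration of the sparse sampling that the abstract attributes to existing works (this is not the DsCML method itself), the sketch below picks one 2D feature vector per projected 3D point and reports how much of the dense feature map is never used. The function names, the toy feature map, and the pixel coordinates are all hypothetical and chosen only for this example.

```python
def sample_pixel_features(feature_map, pixel_coords):
    """Keep one 2D feature vector per projected 3D point.

    feature_map: nested list of shape H x W x C (dense pixel-wise features).
    pixel_coords: list of (row, col) pixels that 3D points project onto.
    """
    return [feature_map[r][c] for r, c in pixel_coords]


def unused_pixel_fraction(height, width, pixel_coords):
    """Fraction of pixels whose features are dropped by sparse sampling."""
    used = set(pixel_coords)
    return 1.0 - len(used) / (height * width)


# Toy 4 x 6 feature map with 2 channels (assumed values, for illustration).
H, W = 4, 6
feature_map = [[[float(r), float(c)] for c in range(W)] for r in range(H)]

# Assumed projections of three 3D points onto the image plane.
pixel_coords = [(0, 1), (2, 3), (3, 5)]

point_features_2d = sample_pixel_features(feature_map, pixel_coords)
print(len(point_features_2d))                     # 3 feature vectors kept
print(unused_pixel_fraction(H, W, pixel_coords))  # 0.875 of pixels unused
```

In this toy case only 3 of 24 pixel features survive the sampling, which is the kind of loss the abstract says motivates a sparse-to-dense matching scheme.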

updated: Sun Aug 08 2021 01:57:48 GMT+0000 (UTC)

published: Fri Jul 30 2021 15:55:55 GMT+0000 (UTC)

arXiv
