Attention-based Dual Supervised Decoder for RGBD Semantic Segmentation

Yang Zhang; Yang Yang; Chenyun Xiong; Guodong Sun; Yanwen Guo

Encoder-decoder models are widely used in RGBD semantic segmentation, and most of them are designed as two-stream networks. In general, jointly reasoning over the color and geometric information in RGBD data benefits semantic segmentation. However, most existing approaches fail to make comprehensive use of multimodal information in both the encoder and the decoder. In this paper, we propose a novel attention-based dual supervised decoder for RGBD semantic segmentation. In the encoder, we design a simple yet effective attention-based multimodal fusion module to extract and fuse deep, multi-level, paired complementary information. To learn more robust deep representations and richer multimodal information, we introduce a dual-branch decoder that effectively leverages the correlations and complementary cues of different tasks. Extensive experiments on the NYUDv2 and SUN-RGBD datasets demonstrate that our method achieves superior performance compared with state-of-the-art methods.
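To make the idea of attention-based multimodal fusion more concrete, the following is a minimal sketch of one common way to fuse paired RGB and depth feature maps with channel attention. It is not the module described in the paper: the abstract does not specify the design, so the squeeze-and-excitation-style gating, the function names (`channel_attention`, `fuse_rgbd`), the reduction ratio `r`, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Hypothetical channel gate for a feature map of shape (C, H, W).

    Returns one weight per channel, shape (C,), each in the range [0, 1].
    """
    squeezed = feat.mean(axis=(1, 2))        # global average pooling -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # ReLU bottleneck -> (C // r,)
    return sigmoid(w2 @ hidden)              # per-channel weights -> (C,)


def fuse_rgbd(rgb_feat, depth_feat, params):
    """Reweight each modality by its own channel gate, then sum (assumed scheme)."""
    a_rgb = channel_attention(rgb_feat, *params["rgb"])
    a_depth = channel_attention(depth_feat, *params["depth"])
    return rgb_feat * a_rgb[:, None, None] + depth_feat * a_depth[:, None, None]


# Toy inputs: random features stand in for one encoder level of each stream.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
params = {
    name: (rng.standard_normal((C // r, C)), rng.standard_normal((C, C // r)))
    for name in ("rgb", "depth")
}
rgb_feat = rng.standard_normal((C, H, W))
depth_feat = rng.standard_normal((C, H, W))

fused = fuse_rgbd(rgb_feat, depth_feat, params)
print(fused.shape)
```

In a two-stream encoder, a fusion step of this kind would typically be applied at several depths so that each level contributes its own fused feature map; how the paper wires these levels into its dual-branch decoder is not stated in the abstract.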

updated: Tue Mar 15 2022 03:29:15 GMT+0000 (UTC)

published: Wed Jan 05 2022 03:12:27 GMT+0000 (UTC)

arXiv
