SwinNet: Swin Transformer drives edge-aware RGB-D and RGB-T salient object detection

Zhengyi Liu; Yacheng Tan; Qian He; Yun Xiao

Convolutional neural networks (CNNs) are good at extracting contextual features within certain receptive fields, while transformers can model global long-range dependencies. By combining the advantages of transformers and CNNs, Swin Transformer shows strong feature representation ability. Building on it, we propose SwinNet, a cross-modality fusion model for RGB-D and RGB-T salient object detection. It is driven by Swin Transformer to extract hierarchical features, boosted by an attention mechanism to bridge the gap between the two modalities, and guided by edge information to sharpen the contours of salient objects. Specifically, a two-stream Swin Transformer encoder first extracts multi-modality features, and a spatial alignment and channel re-calibration module then optimizes the intra-level cross-modality features. To clarify fuzzy boundaries, an edge-guided decoder performs inter-level cross-modality fusion under the guidance of edge features. The proposed model outperforms state-of-the-art models on RGB-D and RGB-T datasets, showing that it provides more insight into the cross-modality complementarity task.
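
The abstract does not spell out how the spatial alignment and channel re-calibration module works, so the following is only a minimal, parameter-free sketch of the general idea of attention-based fusion at one feature level. It is not the authors' implementation. All function names (`spatial_gate`, `channel_gate`, `fuse_level`), the choice of gates, and the final summation are assumptions made for illustration. Features are plain nested lists shaped channels x height x width.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_gate(rgb, aux):
    """Assumed spatial alignment: one H x W gate shared by both modalities.

    The gate is high where the RGB and auxiliary (depth or thermal)
    features agree, measured by their channel-averaged product.
    """
    channels, height, width = len(rgb), len(rgb[0]), len(rgb[0][0])
    return [
        [
            sigmoid(sum(rgb[c][i][j] * aux[c][i][j] for c in range(channels)) / channels)
            for j in range(width)
        ]
        for i in range(height)
    ]


def channel_gate(feat):
    """Assumed channel re-calibration: one weight per channel,
    computed from global average pooling followed by a sigmoid."""
    return [sigmoid(sum(map(sum, ch)) / (len(ch) * len(ch[0]))) for ch in feat]


def apply_gates(feat, s_gate, c_gate):
    """Scale every value by its spatial gate and its channel gate."""
    return [
        [
            [feat[c][i][j] * s_gate[i][j] * c_gate[c] for j in range(len(feat[0][0]))]
            for i in range(len(feat[0]))
        ]
        for c in range(len(feat))
    ]


def fuse_level(rgb, aux):
    """Fuse two same-shaped feature maps from one encoder level."""
    s_gate = spatial_gate(rgb, aux)
    rgb_out = apply_gates(rgb, s_gate, channel_gate(rgb))
    aux_out = apply_gates(aux, s_gate, channel_gate(aux))
    # Element-wise sum of the two re-weighted modalities (an assumption).
    return [
        [[r + a for r, a in zip(r_row, a_row)] for r_row, a_row in zip(r_ch, a_ch)]
        for r_ch, a_ch in zip(rgb_out, aux_out)
    ]


if __name__ == "__main__":
    rgb = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
    depth = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]]
    fused = fuse_level(rgb, depth)
    print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a real model the gates would be learned (for example with convolution or linear layers) and the inputs would come from the two Swin Transformer streams; the sketch only shows how a shared spatial gate and per-modality channel gates could re-weight features before they are combined.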

updated: Tue Apr 12 2022 07:37:39 GMT+0000 (UTC)

published: Tue Apr 12 2022 07:37:39 GMT+0000 (UTC)

arXiv
