CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion

Zixiang Zhao; Haowen Bai; Jiangshe Zhang; Yulun Zhang; Shuang Xu; Zudi Lin; Radu Timofte; Luc Van Gool

Multi-modality (MM) image fusion aims to render fused images that maintain the merits of different modalities, e.g., functional highlights and detailed textures. To tackle the challenges of modeling cross-modality features and decomposing desirable modality-specific and modality-shared features, we propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network. First, CDDFuse uses Restormer blocks to extract cross-modality shallow features. We then introduce a dual-branch Transformer-CNN feature extractor, with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Networks (INN) blocks focusing on extracting high-frequency local information. We further propose a correlation-driven loss that, based on the embedded information, makes the low-frequency features correlated and keeps the high-frequency features uncorrelated. Then, the LT-based global fusion and INN-based local fusion layers output the fused image. Extensive experiments demonstrate that our CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion. We also show that CDDFuse can boost performance on downstream infrared-visible semantic segmentation and object detection in a unified benchmark. The code is available at https://github.com/Zhaozixiang1228/MMIF-CDDFuse.
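
To make the idea of a correlation-driven loss concrete, the following is a minimal sketch in plain Python. It is not taken from the released code: the function names `pearson_cc` and `decomposition_loss`, the ratio form of the loss, and the `offset` value are all assumptions chosen for illustration. The sketch only shows one plausible way to reward correlated low-frequency features while penalizing correlated high-frequency features across two modalities.

```python
import math


def pearson_cc(a, b, eps=1e-8):
    """Pearson correlation coefficient between two flattened feature vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    cov = sum(x * y for x, y in zip(da, db))
    norm = math.sqrt(sum(x * x for x in da)) * math.sqrt(sum(y * y for y in db))
    # eps avoids division by zero for constant inputs
    return cov / (norm + eps)


def decomposition_loss(low_a, low_b, high_a, high_b, offset=1.01):
    """Hypothetical loss: small when the low-frequency features of the two
    modalities are correlated and the high-frequency features are not.

    `offset` (an assumed constant > 1) keeps the denominator positive,
    since a correlation coefficient lies in [-1, 1].
    """
    cc_low = pearson_cc(low_a, low_b)
    cc_high = pearson_cc(high_a, high_b)
    return cc_high ** 2 / (cc_low + offset)


# Toy features: shared low-frequency content, unrelated high-frequency content.
low_ir, low_vis = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
high_ir, high_vis = [1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]

good = decomposition_loss(low_ir, low_vis, high_ir, high_vis)
bad = decomposition_loss(high_ir, high_vis, low_ir, low_vis)  # roles swapped
print(good < bad)
```

In this toy case the loss is near zero when the decomposition behaves as intended, and it grows when the roles of the two feature groups are swapped. In an actual training setup the same quantity would be computed on feature tensors with a differentiable framework rather than on Python lists.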

updated: Mon Apr 10 2023 10:46:30 GMT+0000 (UTC)

published: Sat Nov 26 2022 02:40:28 GMT+0000 (UTC)

arXiv
