Towards Accurate RGB-D Saliency Detection with Complementary Attention and Adaptive Integration

Hong-Bo Bi; Zi-Qi Liu; Kang Wang; Bo Dong; Geng Chen; Ji-Quan Ma

Saliency detection based on the complementary information from RGB images and depth maps has recently gained great popularity. In this paper, we propose the Complementary Attention and Adaptive Integration Network (CAAI-Net), a novel RGB-D saliency detection model that integrates complementary-attention-based feature concentration and adaptive cross-modal feature fusion into a unified framework for accurate saliency detection. Specifically, we propose a context-aware complementary attention (CCA) module, which consists of a feature interaction component, a complementary attention component, and a global-context component. The CCA module first uses the feature interaction component to extract rich local context features. The resulting features are then fed into the complementary attention component, which uses the complementary attention generated from adjacent levels to guide the attention at the current layer, so that mutual background disturbances are suppressed and the network focuses more on the regions containing salient objects. Finally, we use a specially designed adaptive feature integration (AFI) module, which explicitly accounts for the low quality of depth maps, to aggregate the RGB and depth features adaptively. Extensive experiments on six challenging benchmark datasets demonstrate that CAAI-Net is an effective saliency detection model and outperforms nine state-of-the-art models in terms of four widely used metrics. In addition, extensive ablation studies confirm the effectiveness of the proposed CCA and AFI modules.
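The abstract does not say how the AFI module computes its fusion weights, so the following is only a minimal sketch of the general idea of adaptive cross-modal fusion, not the authors' implementation. It assumes a simple per-element sigmoid gate; the function name `adaptive_fuse` and the parameters `w_rgb`, `w_depth`, and `bias` are hypothetical stand-ins for weights that a real network would learn.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_fuse(rgb_feat, depth_feat, w_rgb=1.0, w_depth=1.0, bias=0.0):
    """Blend two equally sized feature vectors with a per-element gate.

    Hypothetical illustration: the gate is computed from both inputs.
    A gate near 1 trusts the depth feature; a gate near 0 falls back
    to the RGB feature, which is the desired behavior when the depth
    map is unreliable.
    """
    if len(rgb_feat) != len(depth_feat):
        raise ValueError("feature vectors must have the same length")
    fused = []
    for r, d in zip(rgb_feat, depth_feat):
        # Assumed gating rule; in a trained model these weights are learned.
        gate = sigmoid(w_rgb * r + w_depth * d + bias)
        fused.append((1.0 - gate) * r + gate * d)
    return fused


if __name__ == "__main__":
    rgb = [1.0, 2.0]
    depth = [5.0, 6.0]
    # A strongly negative bias drives the gate toward 0 (depth ignored).
    print(adaptive_fuse(rgb, depth, bias=-100.0))
    # Zero weights and zero bias give a gate of 0.5 (plain average).
    print(adaptive_fuse(rgb, depth, w_rgb=0.0, w_depth=0.0))
```

The point of the sketch is that the mixing ratio is a function of the inputs rather than a fixed constant, so a low-quality depth feature can be down-weighted at each position instead of being added or concatenated unconditionally.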

updated: Mon Feb 08 2021 08:08:30 GMT+0000 (UTC)

published: Mon Feb 08 2021 08:08:30 GMT+0000 (UTC)

arXiv
