Bifurcated backbone strategy for RGB-D salient object detection

Yingjie Zhai; Deng-Ping Fan; Jufeng Yang; Ali Borji; Ling Shao; Junwei Han; Liang Wang

Multi-level feature fusion is a fundamental topic in computer vision. It has been exploited to detect, segment, and classify objects at various scales. When multi-level features meet multi-modal cues, choosing the optimal feature aggregation and multi-modal learning strategy becomes a difficult and much-debated problem. In this paper, we leverage the inherent multi-modal and multi-level nature of RGB-D salient object detection to devise a novel cascaded refinement network. First, we propose to regroup the multi-level features into teacher and student features using a bifurcated backbone strategy (BBS). Second, we introduce a depth-enhanced module (DEM) to excavate informative depth cues from the channel and spatial views. The RGB and depth modalities are then fused in a complementary way. Our architecture, named Bifurcated Backbone Strategy Network (BBS-Net), is simple, efficient, and backbone-independent. Extensive experiments show that BBS-Net significantly outperforms 18 state-of-the-art (SOTA) models on eight challenging datasets under five evaluation measures, demonstrating the superiority of our approach (~4% improvement in S-measure over the top-ranked model, DMRA-iccv2019). In addition, we provide a comprehensive analysis of the generalization ability of different RGB-D datasets and offer a powerful training set for future research.
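To make the DEM idea concrete, the following is a minimal sketch in plain Python of gating depth features first along the channel view and then along the spatial view before fusing them with RGB features. It is not the authors' implementation: the abstract does not specify the operations, so the max-pooled sigmoid gates, the absence of learned weights, the element-wise addition used for fusion, and all function names (`channel_attention`, `spatial_attention`, `depth_enhance`, `fuse`) are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate derived from its global maximum.

    feat is a list of C channels, each an H x W list of lists.
    Assumption: a real module would pass the pooled statistics
    through learned layers; here the gate has no parameters.
    """
    out = []
    for channel in feat:
        gate = sigmoid(max(max(row) for row in channel))
        out.append([[gate * v for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each pixel by a gate derived from its maximum across channels."""
    height, width = len(feat[0]), len(feat[0][0])
    gates = [
        [sigmoid(max(ch[i][j] for ch in feat)) for j in range(width)]
        for i in range(height)
    ]
    return [
        [[gates[i][j] * ch[i][j] for j in range(width)] for i in range(height)]
        for ch in feat
    ]


def depth_enhance(depth_feat):
    """Hypothetical DEM: channel view first, then spatial view."""
    return spatial_attention(channel_attention(depth_feat))


def fuse(rgb_feat, depth_feat):
    """Assumed fusion: add the enhanced depth features to the RGB features."""
    enhanced = depth_enhance(depth_feat)
    return [
        [[r + d for r, d in zip(rgb_row, dep_row)]
         for rgb_row, dep_row in zip(rgb_ch, dep_ch)]
        for rgb_ch, dep_ch in zip(rgb_feat, enhanced)
    ]
```

Because every gate lies strictly between 0 and 1, this sketch can only attenuate depth responses, never amplify them, so an uninformative (all-zero) depth map leaves the RGB features unchanged after fusion.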

updated: Wed Aug 18 2021 01:13:38 GMT+0000 (UTC)

published: Mon Jul 06 2020 13:01:30 GMT+0000 (UTC)

arXiv
