RGB-D Salient Object Detection with Ubiquitous Target Awareness

Yifan Zhao; Jiawei Zhao; Jia Li; Xiaowu Chen

Conventional RGB-D salient object detection (SOD) methods aim to leverage depth as complementary information to find the salient regions in both modalities. However, the detection results rely heavily on the quality of the captured depth data, which is sometimes unavailable. In this work, we make the first attempt to solve the RGB-D salient object detection problem with a novel depth-awareness framework. This framework relies only on RGB data in the testing phase and uses captured depth data as supervision for representation learning. To construct this framework and achieve accurate saliency detection results, we propose a Ubiquitous Target Awareness (UTA) network that addresses three important challenges in the RGB-D SOD task with three components: 1) a depth awareness module that excavates depth information and mines ambiguous regions via adaptive depth-error weights, 2) a spatial-aware cross-modal interaction and a channel-aware cross-level interaction that exploit low-level boundary cues and amplify high-level salient channels, and 3) a gated multi-scale predictor module that perceives object saliency at different contextual scales. Besides its high performance, the proposed UTA network is depth-free at inference and runs in real time at 43 FPS. Experimental evidence demonstrates that the proposed network not only surpasses state-of-the-art methods on five public RGB-D SOD benchmarks by a large margin, but also demonstrates its extensibility on five public RGB SOD benchmarks.
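The abstract does not give the formula behind the adaptive depth-error weights, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a hypothetical weighting rule, w = 1 + alpha * error / max_error, in which pixels where a depth estimate disagrees with the captured depth receive a larger weight in a binary cross-entropy saliency loss. The function names, the `alpha` parameter, and the flattened-list image representation are all assumptions made for this example. Depth appears only in this training-time loss, which is consistent with the depth-free inference described above.

```python
import math

def depth_error_weights(pred_depth, true_depth, alpha=1.0):
    """Per-pixel weights that grow with the depth-estimation error.

    Hypothetical rule (not taken from the paper):
        w = 1 + alpha * |pred - true| / max_error
    Inputs are flattened images given as lists of floats.
    """
    errors = [abs(p - t) for p, t in zip(pred_depth, true_depth)]
    max_err = max(errors) or 1.0  # fall back to 1.0 when every error is zero
    return [1.0 + alpha * e / max_err for e in errors]

def weighted_bce(pred_sal, true_sal, weights, eps=1e-7):
    """Binary cross-entropy over pixels, normalized by the sum of weights."""
    total = 0.0
    for p, y, w in zip(pred_sal, true_sal, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)

# Toy example with three pixels; depth is used only to build the weights.
weights = depth_error_weights([0.2, 0.5, 0.9], [0.2, 0.4, 0.5])
loss = weighted_bce([0.9, 0.6, 0.3], [1, 1, 0], weights)
```

In this sketch the third pixel has the largest depth error and therefore the largest influence on the loss, which is one simple way to emphasize ambiguous regions during training.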

updated: Wed Sep 08 2021 04:27:29 GMT+0000 (UTC)

published: Wed Sep 08 2021 04:27:29 GMT+0000 (UTC)

arXiv
