Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection

Zhuoling Li; Zhan Qu; Yang Zhou; Jianzhuang Liu; Haoqian Wang; Lihui Jiang

As an inherently ill-posed problem, depth estimation from a single image is the most challenging part of monocular 3D object detection (M3OD). Many existing methods rely on preconceived assumptions to compensate for the spatial information missing from monocular images, and predict a single depth value for every object of interest. However, these assumptions do not always hold in practical applications. To tackle this problem, we propose a depth solving system that fully explores the visual clues from the subtasks in M3OD and generates multiple estimations for the depth of each target. Since the depth estimations rely on essentially different assumptions, they present diverse distributions. Even if some assumptions collapse, the estimations based on the remaining assumptions are still reliable. In addition, we develop a depth selection and combination strategy that removes abnormal estimations caused by collapsed assumptions and adaptively combines the remaining estimations into a single one. In this way, our depth solving system becomes more precise and robust. Exploiting the clues from multiple subtasks of M3OD and without introducing any extra information, our method surpasses the current best method by more than 20% in relative terms on the Moderate level of the test split of the KITTI 3D object detection benchmark, while still maintaining real-time efficiency.
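The abstract does not spell out how abnormal estimations are removed or how the remaining ones are combined. The sketch below is only one plausible illustration of the idea, not the authors' method: it assumes each depth estimate comes with a positive uncertainty value, rejects estimates that lie far from the median (a median-absolute-deviation rule chosen here for illustration), and averages the survivors with inverse-uncertainty weights. The function name `combine_depths` and the parameters `min_tol` and `mad_scale` are hypothetical.

```python
from statistics import median


def combine_depths(depths, uncertainties, min_tol=0.5, mad_scale=3.0):
    """Illustrative selection-and-combination of several depth estimates.

    Assumptions (not taken from the paper): outliers are detected with a
    median-absolute-deviation rule, and the surviving estimates are merged
    with inverse-uncertainty weights.
    """
    if not depths or len(depths) != len(uncertainties):
        raise ValueError("depths and uncertainties must be non-empty and equal in length")
    if any(u <= 0 for u in uncertainties):
        raise ValueError("uncertainties must be positive")

    # Selection: estimates far from the median are treated as abnormal.
    center = median(depths)
    mad = median(abs(d - center) for d in depths)
    threshold = max(min_tol, mad_scale * mad)
    kept = [(d, u) for d, u in zip(depths, uncertainties)
            if abs(d - center) <= threshold]

    # Combination: a lower uncertainty gives an estimate a larger weight.
    weights = [1.0 / u for _, u in kept]
    weighted_sum = sum(w * d for w, (d, _) in zip(weights, kept))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Three estimates agree; the fourth stands in for a collapsed assumption.
    print(combine_depths([20.0, 20.4, 19.8, 35.0], [1.0, 1.0, 1.0, 1.0]))
```

In this example the 35.0 m estimate is discarded and the result is the weighted mean of the three consistent estimates. The thresholds are arbitrary and would need tuning for any real use.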

updated: Thu May 19 2022 08:12:55 GMT+0000 (UTC)

published: Thu May 19 2022 08:12:55 GMT+0000 (UTC)

arXiv
