Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection

Yinmin Zhang; Xinzhu Ma; Shuai Yi; Jun Hou; Zhihui Wang; Wanli Ouyang; Dan Xu

As a crucial task in autonomous driving, 3D object detection has made great progress in recent years. However, monocular 3D object detection remains a challenging problem because depth estimation performance is still unsatisfactory. Most existing monocular methods regress the scene depth directly and ignore important relationships between the depth and various geometric elements (e.g., bounding box sizes, 3D object dimensions, and object poses). In this paper, we propose to learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection. Specifically, we devise a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network. We further implement and embed the proposed formula to enable geometry-aware deep representation learning, allowing effective 2D and 3D interactions that boost depth estimation. Moreover, we provide a strong baseline by addressing the substantial misalignment between 2D annotations and projected boxes, which ensures robust learning with the proposed geometric formula. Experiments on the KITTI dataset show that our method improves the detection performance of the state-of-the-art monocular method without extra data by 2.80% on the moderate test setting. The model and code will be released at https://github.com/YinminZhang/MonoGeo.
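To make the kind of relationship between depth and geometric elements concrete, the sketch below uses the textbook pinhole-camera relation z = f * H / h, where H is an object's 3D height in meters and h is its projected 2D height in pixels. This is only an illustration of a projective constraint and is not the formula proposed in the paper, which the abstract does not spell out. The function name `depth_from_heights` and the numeric values (including the focal length) are assumptions chosen for the example.

```python
def depth_from_heights(focal_px: float, height_3d_m: float, height_2d_px: float) -> float:
    """Estimate object depth with the standard pinhole relation z = f * H / h.

    Illustrative only: this is the textbook projective constraint, not the
    geometry formula from the paper.
    """
    if height_2d_px <= 0:
        raise ValueError("2D box height must be positive")
    return focal_px * height_3d_m / height_2d_px


# Assumed example values: a 1.5 m tall car that appears 54 px tall,
# seen by a camera with a 720 px focal length.
z = depth_from_heights(focal_px=720.0, height_3d_m=1.5, height_2d_px=54.0)
print(f"estimated depth: {z:.1f} m")
```

Under this relation, an error in either the predicted 3D height or the 2D box height propagates directly into the depth, which is one reason a misalignment between 2D annotations and projected boxes would matter for a geometry-based depth estimate.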

updated: Wed Apr 24 2024 12:20:54 GMT+0000 (UTC)

published: Thu Jul 29 2021 12:30:39 GMT+0000 (UTC)

arXiv
