SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection

Qiu Zhou; Jinming Cao; Hanchao Leng; Yifang Yin; Yu Kun; Roger Zimmermann

In the field of autonomous driving, accurate and comprehensive perception of the 3D environment is crucial. Bird's Eye View (BEV) based methods have emerged as a promising solution for 3D object detection using multi-view images as input. However, existing 3D object detection methods often ignore the physical context in the environment, such as sidewalks and vegetation, resulting in sub-optimal performance. In this paper, we propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection), which leverages a 3D semantic-occupancy branch to improve the accuracy of 3D object detection. In particular, the physical context modeled by semantic occupancy helps the detector perceive scenes in a more holistic view. SOGDet is flexible to use and can be seamlessly integrated with most existing BEV-based methods. To evaluate its effectiveness, we apply this approach to several state-of-the-art baselines and conduct extensive experiments on the nuScenes dataset. Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP). This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems. The code is available at: https://github.com/zhouqiu/SOGDet.
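For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how NDS combines mAP with the five true-positive error metrics of the nuScenes detection benchmark (mATE, mASE, mAOE, mAVE, mAAE). It follows the benchmark's published definition, NDS = (5 * mAP + sum(1 - min(1, mTP))) / 10. It is not taken from the SOGDet code base, the function name `nuscenes_detection_score` is hypothetical, and the numbers in the usage lines are illustrative placeholders, not results from the paper.

```python
from typing import Mapping

# The five true-positive error metrics defined by the nuScenes detection benchmark.
TP_METRICS = ("mATE", "mASE", "mAOE", "mAVE", "mAAE")


def nuscenes_detection_score(mean_ap: float, tp_errors: Mapping[str, float]) -> float:
    """Combine mAP and the five TP error metrics into NDS.

    Hypothetical helper for illustration only; not part of the SOGDet repository.
    """
    missing = [name for name in TP_METRICS if name not in tp_errors]
    if missing:
        raise ValueError(f"missing TP metrics: {missing}")
    # Each error is clipped to at most 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, tp_errors[name]) for name in TP_METRICS]
    # mAP carries weight 5; each TP score carries weight 1; the total weight is 10.
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Illustrative placeholder values, not numbers reported in the paper.
example_errors = {name: 0.5 for name in TP_METRICS}
example_nds = nuscenes_detection_score(0.5, example_errors)
```

Because mAP accounts for half of the score, a method that improves both mAP and the error metrics, as the abstract reports for SOGDet on its three baselines, raises NDS through both terms.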

updated: Sat Jan 06 2024 06:19:11 GMT+0000 (UTC)

published: Sat Aug 26 2023 07:38:21 GMT+0000 (UTC)

arXiv
