Radar Voxel Fusion for 3D Object Detection

Felix Nobis; Ehsan Shafiei; Phillip Karle; Johannes Betz; Markus Lienkamp

Automotive traffic scenes are complex due to the variety of possible scenarios, objects, and weather conditions that need to be handled. In contrast to more constrained environments, such as automated underground trains, automotive perception systems cannot be tailored to a narrow field of specific tasks but must handle an ever-changing environment with unforeseen events. As no single sensor can currently perceive all relevant activity in the surroundings reliably, sensor data fusion is applied to perceive as much information as possible. Fusing data from different sensors and sensor modalities at a low abstraction level makes it possible to compensate for sensor weaknesses and misdetections while the sensor data are still information-rich, before sensor-individual object detection compresses the data and information is lost. This paper develops a low-level sensor fusion network for 3D object detection, which fuses lidar, camera, and radar data. The fusion network is trained and evaluated on the nuScenes data set. On the test set, fusion of radar data increases the resulting AP (Average Precision) detection score by about 5.1% in comparison to the baseline lidar network. The radar sensor fusion proves especially beneficial in inclement conditions such as rain and night scenes. Fusing additional camera data contributes positively only in conjunction with the radar fusion, which shows that interdependencies of the sensors are important for the detection result. Additionally, the paper proposes a novel loss to handle the discontinuity of a simple yaw representation for object detection. The updated loss increases the detection and orientation estimation performance for all sensor input configurations. The code for this research has been made available on GitHub.
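The yaw discontinuity mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the form of the proposed loss, so the code below is not the authors' loss; it only shows why a raw yaw difference is problematic at the ±π boundary and how two common periodic alternatives behave. All function names here are hypothetical and chosen for illustration.

```python
import math


def naive_yaw_error(pred: float, target: float) -> float:
    """Absolute difference of raw yaw values (radians); jumps at the +/-pi boundary."""
    return abs(pred - target)


def wrapped_yaw_error(pred: float, target: float) -> float:
    """Smallest absolute angular difference, always in [0, pi]."""
    diff = pred - target
    return abs(math.atan2(math.sin(diff), math.cos(diff)))


def periodic_yaw_loss(pred: float, target: float) -> float:
    """1 - cos(diff): smooth, 2*pi-periodic, zero when the headings coincide."""
    return 1.0 - math.cos(pred - target)


# Two headings that are only 0.1 rad apart but lie on opposite sides of +/-pi.
pred = math.pi - 0.05
target = -math.pi + 0.05

naive = naive_yaw_error(pred, target)       # close to 2*pi, although the headings nearly agree
wrapped = wrapped_yaw_error(pred, target)   # the true angular distance of 0.1 rad
periodic = periodic_yaw_loss(pred, target)  # small, because cos is periodic

print(f"naive={naive:.3f} wrapped={wrapped:.3f} periodic={periodic:.5f}")
```

With a raw difference, a network is penalized heavily for a prediction that is almost correct, which is the kind of discontinuity a yaw-aware loss has to avoid. Whether the paper uses a wrapped difference, a trigonometric form, or something else is not specified in the abstract.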

updated: Sat Jun 26 2021 20:34:12 GMT+0000 (UTC)

published: Sat Jun 26 2021 20:34:12 GMT+0000 (UTC)

arXiv
