Multi-Modal 3D Object Detection in Autonomous Driving: a Survey

Yingjie Wang; Qiuyu Mao; Hanqi Zhu; Yu Zhang; Jianmin Ji; Yanyong Zhang

In the past few years, we have witnessed rapid development of autonomous driving. However, achieving full autonomy remains a daunting task due to the complex and dynamic driving environment. As a result, self-driving cars are equipped with a suite of sensors to conduct robust and accurate environment perception. As the number and types of sensors keep increasing, combining them for better perception is becoming a natural trend. So far, there has been no in-depth review that focuses on multi-sensor fusion-based perception. To bridge this gap and motivate future research, this survey reviews recent fusion-based 3D detection deep learning models that leverage multiple sensor data sources, especially cameras and LiDARs. In this survey, we first introduce the background of popular sensors for autonomous cars, including their common data representations as well as the object detection networks developed for each type of sensor data. Next, we discuss some popular datasets for multi-modal 3D object detection, with a special focus on the sensor data included in each dataset. Then we present in-depth reviews of recent multi-modal 3D detection networks by considering the following three aspects of the fusion: fusion location, fusion data representation, and fusion granularity. After a detailed review, we discuss open challenges and point out possible solutions. We hope that our detailed review can help researchers embark on investigations in the area of multi-modal 3D object detection.

updated: Fri Jun 25 2021 15:39:13 GMT+0000 (UTC)

published: Thu Jun 24 2021 02:52:12 GMT+0000 (UTC)

arXiv
