MDS-Net: A Multi-scale Depth Stratification Based Monocular 3D Object Detection Algorithm

Zhouzhen Xie; Yuying Song; Jingxuan Wu; Zecheng Li; Chunyi Song; Zhiwei Xu

Monocular 3D object detection is very challenging in autonomous driving due to the lack of depth information. This paper proposes a one-stage monocular 3D object detection algorithm based on multi-scale depth stratification, which uses an anchor-free method to detect 3D objects through per-pixel prediction. In the proposed MDS-Net, a novel depth-based stratification structure improves the network's depth prediction by establishing a mathematical model between an object's depth and its size in the image. A new angle loss function further improves the accuracy of angle prediction and speeds up training convergence. Finally, an optimized soft-NMS is applied in the post-processing stage to adjust the confidence of candidate boxes. Experiments on the KITTI benchmark show that MDS-Net outperforms existing monocular 3D detection methods on the 3D detection and BEV detection tasks while meeting real-time requirements.
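
The link between an object's depth and its size in the image can be illustrated with the standard pinhole camera model. The sketch below is not the model used in MDS-Net, which the abstract does not spell out. It only shows the generic relation that image height shrinks in proportion to 1/depth. The function names, the focal length of 720 px, and the object height of 1.5 m are all assumptions chosen for illustration.

```python
def projected_height_px(focal_px: float, height_m: float, depth_m: float) -> float:
    """Image height (pixels) of an object under a pinhole camera model.

    Assumes the object is upright and roughly fronto-parallel to the image plane.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * height_m / depth_m


def depth_from_height(focal_px: float, height_m: float, height_px: float) -> float:
    """Invert the same relation: estimate depth (metres) from observed image height."""
    if height_px <= 0:
        raise ValueError("image height must be positive")
    return focal_px * height_m / height_px


# Hypothetical values: 720 px focal length, 1.5 m tall object.
FOCAL_PX = 720.0
HEIGHT_M = 1.5

# Doubling the depth halves the projected height.
near = projected_height_px(FOCAL_PX, HEIGHT_M, 15.0)
far = projected_height_px(FOCAL_PX, HEIGHT_M, 30.0)
print(near, far)
```

Under this relation, objects at similar depths occupy similar pixel extents, which gives one plausible intuition for assigning different depth ranges to different feature scales.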

updated: Thu Apr 28 2022 14:31:39 GMT+0000 (UTC)

published: Wed Jan 12 2022 07:11:18 GMT+0000 (UTC)

arXiv
