Joint Monocular 3D Vehicle Detection and Tracking

Hou-Ning Hu; Qi-Zhi Cai; Dequan Wang; Ji Lin; Min Sun; Philipp Krähenbühl; Trevor Darrell; Fisher Yu

Vehicle 3D extents and trajectories are critical cues for predicting the future location of vehicles and planning future agent ego-motion based on those predictions. In this paper, we propose a novel online framework for 3D vehicle detection and tracking from monocular videos. The framework can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform. Our method leverages 3D box depth-ordering matching for robust instance association and utilizes 3D trajectory prediction for re-identification of occluded vehicles. We also design a motion learning module based on an LSTM for more accurate long-term motion extrapolation. Our experiments on simulation, KITTI, and Argoverse datasets show that our 3D tracking pipeline offers robust data association and tracking. On Argoverse, our image-based method is significantly better for tracking 3D vehicles within 30 meters than the LiDAR-centric baseline methods.

updated: Thu Sep 12 2019 08:50:53 GMT+0000 (UTC)

published: Mon Nov 26 2018 23:29:46 GMT+0000 (UTC)

arXiv
