What My Motion tells me about Your Pose: A Self-Supervised Monocular 3D Vehicle Detector

Cédric Picron; Punarjay Chakravarty; Tom Roussel; Tinne Tuytelaars

Estimating the orientation of an observed vehicle relative to an Autonomous Vehicle (AV) from monocular camera data is an important building block in estimating its 6-DoF pose. Current deep-learning-based solutions for placing a 3D bounding box around this observed vehicle are data-hungry and do not generalize well. In this paper, we demonstrate the use of monocular visual odometry for the self-supervised fine-tuning of an orientation estimation model pre-trained on a reference domain. Specifically, while transitioning from a virtual dataset (vKITTI) to nuScenes, we recover up to 70% of the performance of a fully supervised method. We subsequently demonstrate an optimization-based monocular 3D bounding box detector built on top of the self-supervised vehicle orientation estimator, without requiring expensive labeled data. This allows 3D vehicle detection algorithms to be self-trained from large amounts of monocular camera data from existing commercial vehicle fleets.

updated: Wed Mar 24 2021 18:11:37 GMT+0000 (UTC)

published: Wed Jul 29 2020 12:58:40 GMT+0000 (UTC)

arXiv

