YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection

Yuxuan Liu; Lujia Wang; Ming Liu

Object detection in 3D with stereo cameras is an important problem in computer vision, and it is particularly crucial for low-cost autonomous mobile robots without LiDAR. Most of the best-performing frameworks for stereo 3D object detection are currently based on dense depth reconstruction from disparity estimation, which makes them extremely computationally expensive. To enable real-world deployment of visual detection with binocular images, we take a step back to gain insights from 2D image-based detection frameworks and enhance them with stereo features. We incorporate knowledge and the inference structure from real-time one-stage 2D/3D object detectors and introduce a lightweight stereo matching module. Our proposed framework, YOLOStereo3D, is trained on a single GPU and runs at more than ten fps. It demonstrates performance comparable to state-of-the-art stereo 3D detection frameworks without using LiDAR data. The code will be published at https://github.com/Owen-Liuyuxuan/visualDet3D.

updated: Wed Mar 17 2021 03:43:54 GMT+0000 (UTC)

published: Wed Mar 17 2021 03:43:54 GMT+0000 (UTC)

arXiv
