TartanVO: A Generalizable Learning-based VO

Wenshan Wang; Yaoyu Hu; Sebastian Scherer

We present the first learning-based visual odometry (VO) model that generalizes to multiple datasets and real-world scenarios and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data and without any finetuning, generalizes to real-world datasets such as KITTI and EuRoC, and shows significant advantages over geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.
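The abstract names two ingredients without giving formulas: an up-to-scale loss and feeding the camera intrinsics to the model. The sketch below illustrates one plausible reading of each, in plain Python. It assumes that "up-to-scale" means comparing only the direction of the predicted and ground-truth translations, and that the intrinsics are encoded as a two-channel map of normalized pixel coordinates. The function names and signatures are hypothetical and are not taken from the TartanVO repository.

```python
import math


def normalize(t, eps=1e-6):
    """Scale a 3-vector to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(c * c for c in t))
    return [c / max(norm, eps) for c in t]


def up_to_scale_translation_loss(t_pred, t_true, eps=1e-6):
    """Squared distance between unit-normalized translations (assumed form).

    Only the direction of motion is penalized, so a prediction that differs
    from the ground truth by a positive scale factor gives zero loss.
    """
    p = normalize(t_pred, eps)
    g = normalize(t_true, eps)
    return sum((a - b) ** 2 for a, b in zip(p, g))


def intrinsics_layer(width, height, fx, fy, cx, cy):
    """Two-channel map of normalized image coordinates (assumed encoding).

    Channel 0 holds (u - cx) / fx and channel 1 holds (v - cy) / fy for each
    pixel (u, v), stored as nested lists indexed [row][column].
    """
    xs = [[(u - cx) / fx for u in range(width)] for _ in range(height)]
    ys = [[(v - cy) / fy for _ in range(width)] for v in range(height)]
    return xs, ys


if __name__ == "__main__":
    # Same direction, different magnitude: the loss ignores scale.
    print(up_to_scale_translation_loss([2.0, 0.0, 0.0], [0.5, 0.0, 0.0]))
    # Tiny 4x2 image with hypothetical intrinsics.
    xs, ys = intrinsics_layer(4, 2, fx=2.0, fy=2.0, cx=1.0, cy=0.0)
    print(xs[0], ys[1])
```

A monocular camera cannot observe absolute scale, which is why a loss of this kind compares directions only. A map like the one from `intrinsics_layer` could be concatenated with the network input so that the same model can be applied to cameras with different focal lengths and principal points.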

updated: Sat Oct 31 2020 20:49:33 GMT+0000 (UTC)

published: Sat Oct 31 2020 20:49:33 GMT+0000 (UTC)

arXiv
