Energy-Based Models for Cross-Modal Localization using Convolutional Transformers

Alan Wu; Michael S. Ryoo

We present a novel framework using Energy-Based Models (EBMs) for localizing a ground vehicle equipped with a range sensor against satellite imagery in the absence of GPS. Lidar sensors have become ubiquitous on autonomous vehicles for describing their surrounding environment. Map priors for localization are typically built using the same sensor modality, but building maps with range sensors is often expensive and time-consuming. As an alternative, we use satellite images as map priors, since they are widely available, easily accessible, and provide comprehensive coverage. We propose a method using convolutional transformers that performs accurate metric-level localization in a cross-modal manner, which is challenging due to the drastic difference in appearance between the sparse range sensor readings and the rich satellite imagery. We train our model end-to-end and demonstrate that our approach achieves higher accuracy than the state of the art on KITTI, Pandaset, and a custom dataset.
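As a rough illustration of the energy-based view of localization, the sketch below scores a few candidate positions with scalar energies, converts them to a Boltzmann distribution, and picks the lowest-energy candidate. It is a generic EBM example, not the paper's model: the function names, the candidate offsets, and the energy values are all hypothetical, and in the actual method the energies would presumably come from a learned network comparing range sensor data with satellite image patches.

```python
import math

def energies_to_probabilities(energies, temperature=1.0):
    """Map energies to probabilities proportional to exp(-E / T)."""
    # Shift by the minimum energy for numerical stability;
    # the shift cancels out after normalization.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def localize(candidates, energies):
    """Return the candidate with the lowest energy (the most compatible one)."""
    best = min(range(len(energies)), key=lambda i: energies[i])
    return candidates[best]

# Hypothetical candidate (x, y) positions in metres, with made-up energies
# standing in for the output of a learned energy function.
candidates = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
energies = [2.3, 0.4, 1.7]

probs = energies_to_probabilities(energies)
estimate = localize(candidates, energies)
```

Lower energy means higher compatibility between the sensor reading and a candidate location, so the minimum-energy candidate is also the most probable one under the Boltzmann distribution.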

updated: Tue Jun 06 2023 21:27:08 GMT+0000 (UTC)

published: Tue Jun 06 2023 21:27:08 GMT+0000 (UTC)

arXiv
