Instant Visual Odometry Initialization for Mobile AR

Alejo Concha; Michael Burri; Jesús Briales; Christian Forster; Luc Oth

Mobile AR applications benefit from fast initialization to display world-locked effects instantly. However, standard visual odometry or SLAM algorithms require motion parallax to initialize (see Figure 1) and, therefore, suffer from delayed initialization. In this paper, we present a 6-DoF monocular visual odometry that initializes instantly and without motion parallax. Our main contribution is a pose estimator that decouples estimating the 5-DoF relative rotation and translation direction from the 1-DoF translation magnitude. While scale is not observable in a monocular vision-only setting, it is still paramount to estimate a consistent scale over the whole trajectory (even if not physically accurate) to avoid AR effects moving erroneously along depth. In our approach, we leverage the fact that depth errors are not perceivable to the user during rotation-only motion. However, as the user starts translating the device, depth becomes perceivable and so does the capability to estimate consistent scale. Our proposed algorithm naturally transitions between these two modes. We perform extensive validations of our contributions with both a publicly available dataset and synthetic data. We show that the proposed pose estimator outperforms the classical approaches for 6-DoF pose estimation used in the literature in low-parallax configurations. We release a dataset for the relative pose problem using real data to facilitate the comparison with future solutions for the relative pose problem. Our solution is either used as a full odometry or as a preSLAM component of any supported SLAM system (ARKit, ARCore) in world-locked AR effects on platforms such as Instagram and Facebook.
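The claim that depth errors are not perceivable during rotation-only motion, but become perceivable once the device translates, can be checked with a small numerical sketch. This is not the authors' estimator; it is an illustration under assumed values. The function names, the yaw-only rotation, the bearing vector, and the depths of 2 and 4 units are all hypothetical. The relative pose is parameterized in the decoupled form described above: a rotation, a unit translation direction, and a separate scalar translation magnitude.

```python
import math

def rotate_y(p, angle):
    """Rotate 3D point p about the camera y axis by `angle` radians."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(p):
    """Pinhole projection onto the normalized image plane (focal length 1)."""
    x, y, z = p
    return (x / z, y / z)

def reproject(bearing, depth, yaw, direction, magnitude):
    """Place a point at `depth` along `bearing` in frame 1, project it in frame 2.

    The pose is split into rotation (yaw only, for simplicity), a unit
    translation `direction`, and a scalar translation `magnitude`.
    """
    p1 = tuple(depth * b for b in bearing)
    t = tuple(magnitude * d for d in direction)
    p2 = tuple(a + b for a, b in zip(rotate_y(p1, yaw), t))  # p2 = R * p1 + t
    return project(p2)

def depth_error_shift(magnitude):
    """Image-plane shift caused by doubling the assumed depth of one point."""
    bearing = (0.1, -0.05, 1.0)      # hypothetical observation in frame 1
    yaw = math.radians(5.0)          # hypothetical relative rotation
    direction = (1.0, 0.0, 0.0)      # hypothetical unit translation direction
    u_true = reproject(bearing, 2.0, yaw, direction, magnitude)
    u_wrong = reproject(bearing, 4.0, yaw, direction, magnitude)
    return math.hypot(u_true[0] - u_wrong[0], u_true[1] - u_wrong[1])

if __name__ == "__main__":
    print("rotation only  :", depth_error_shift(0.0))
    print("with baseline  :", depth_error_shift(0.1))
```

With zero translation magnitude the depth cancels in the projection, so a wrong depth lands on the same image point. With a non-zero magnitude the two depths project to different points, which is the parallax that both exposes depth errors to the user and makes a consistent scale estimable.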

updated: Fri Jul 30 2021 14:25:40 GMT+0000 (UTC)

published: Fri Jul 30 2021 14:25:40 GMT+0000 (UTC)

arXiv
