Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection

Mingyu Yang; Yu Chen; Hun-Seok Kim

In recent years, deep learning-based approaches to visual-inertial odometry (VIO) have shown remarkable performance, outperforming traditional geometric methods. Yet all existing methods use both visual and inertial measurements for every pose estimate, which incurs potential computational redundancy. Although processing visual data is much more expensive than processing inertial measurement unit (IMU) data, it does not always improve pose estimation accuracy. In this paper, we propose an adaptive deep learning-based VIO method that reduces computational redundancy by opportunistically disabling the visual modality. Specifically, we train a policy network that learns to deactivate the visual feature extractor on the fly based on the current motion state and IMU readings. The Gumbel-Softmax trick is used to make the policy network's decision process differentiable, so that the whole system can be trained end to end. The learned strategy is interpretable and shows scenario-dependent decision patterns for adaptive complexity reduction. Experimental results on the KITTI dataset show that our method achieves similar or even better performance than the full-modality baseline while reducing computational complexity by up to 78.8%. The code is available at https://github.com/mingyuyng/Visual-Selective-VIO.
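
As a rough illustration of the Gumbel-Softmax decision step mentioned in the abstract, the sketch below samples a binary "skip / use the visual encoder" choice from two logits using only the Python standard library. It is not taken from the authors' repository: the function names, the logit values, and the two-way [skip, use] layout are assumptions made for illustration. In a real training setup the soft probabilities would carry gradients through an autodiff framework, which this plain-Python version does not do.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one Gumbel-Softmax sample; return (soft_probs, hard_one_hot)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        # Clamp u away from 0 and 1 to keep both logarithms finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]
    # Hard decision used at inference time: one-hot of the largest entry.
    k = max(range(len(soft)), key=soft.__getitem__)
    hard = [1 if i == k else 0 for i in range(len(soft))]
    return soft, hard


def select_visual(logits, tau=1.0, rng=random):
    """Hypothetical helper: index 1 means 'run the visual encoder'."""
    _, hard = gumbel_softmax(logits, tau, rng)
    return hard[1] == 1


rng = random.Random(0)
# Hypothetical policy output [skip, use]; a real policy network would
# compute these logits from the motion state and IMU features.
example_logits = [0.0, 2.0]
soft, hard = gumbel_softmax(example_logits, tau=1.0, rng=rng)
decisions = [select_visual(example_logits, rng=rng) for _ in range(1000)]
usage_rate = sum(decisions) / len(decisions)
```

With these example logits the visual encoder is selected in roughly the softmax proportion of the "use" logit, so `usage_rate` approximates how often the expensive visual branch would run under this hypothetical policy output.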

updated: Wed Oct 19 2022 18:51:53 GMT+0000 (UTC)

published: Thu May 12 2022 16:17:49 GMT+0000 (UTC)

arXiv
