Learning Generalized Visual Odometry Using Position-Aware Optical Flow and Geometric Bundle Adjustment

Yijun Cao; Xianshi Zhang; Fuya Luo; Peng Peng; Yongjie Li

Recent visual odometry (VO) methods that incorporate geometric algorithms into deep-learning architectures have shown outstanding performance on the challenging monocular VO task. Despite these encouraging results, previous methods ignore the need for generalization under noisy environments and across varied scenes. To address this issue, this work first proposes a novel optical flow network (PANet). Whereas previous methods predict optical flow as a direct regression task, PANet predicts an optical flow probability volume over a discrete position space and then converts that volume into optical flow. Next, we adapt the bundle adjustment module to the self-supervised training pipeline by introducing multiple sampling, ego-motion initialization, dynamic damping factor adjustment, and Jacobian matrix weighting. In addition, a novel normalized photometric loss function is proposed to improve depth estimation accuracy. Experiments show that the proposed system not only achieves performance comparable to other state-of-the-art self-supervised learning-based methods on the KITTI dataset, but also significantly improves generalization compared with geometry-based, learning-based, and hybrid VO systems on the noisy KITTI and the challenging outdoor (KAIST) scenes.
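
To make the idea of converting a probability volume over discrete positions into optical flow more concrete, the following is a minimal sketch for a single pixel. It is not the PANet implementation: the abstract does not specify how the conversion is done, so the softmax expectation used here, the function name `flow_from_scores`, and the `radius` parameter are all assumptions chosen for illustration.

```python
import math


def flow_from_scores(scores, radius):
    """Turn a (2r+1) x (2r+1) grid of matching scores for one pixel
    into a sub-pixel flow vector (u, v).

    Assumption (not from the paper): the scores are normalized with a
    softmax and the flow is the expected displacement under that
    distribution. Row index maps to vertical offset, column index to
    horizontal offset, both in the range [-radius, radius].
    """
    size = 2 * radius + 1
    assert len(scores) == size and all(len(row) == size for row in scores)

    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(max(row) for row in scores)
    weights = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in weights)

    u = v = 0.0
    for j, row in enumerate(weights):
        for i, w in enumerate(row):
            p = w / total          # probability of this discrete position
            u += p * (i - radius)  # expected horizontal displacement
            v += p * (j - radius)  # expected vertical displacement
    return u, v


# Hypothetical example: a sharp peak at the top-right candidate position.
scores = [
    [0.0, 0.0, 50.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
u, v = flow_from_scores(scores, radius=1)
```

With a sharply peaked distribution the result approaches the integer offset of the peak, while a flat distribution yields zero flow, so the spread of the distribution can also be read as a rough measure of matching uncertainty.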

updated: Wed Dec 21 2022 11:42:51 GMT+0000 (UTC)

published: Mon Nov 22 2021 12:05:27 GMT+0000 (UTC)

arXiv
