ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields

Mohammad Mahdi Johari; Camilla Carta; François Fleuret

We present ESLAM, an efficient implicit neural representation method for Simultaneous Localization and Mapping (SLAM). ESLAM reads RGB-D frames with unknown camera poses sequentially and incrementally reconstructs the scene representation while estimating the current camera position in the scene. We incorporate the latest advances in Neural Radiance Fields (NeRF) into a SLAM system, resulting in an efficient and accurate dense visual SLAM method. Our scene representation consists of multi-scale, axis-aligned, perpendicular feature planes and shallow decoders that, for each point in continuous space, decode the interpolated features into Truncated Signed Distance Field (TSDF) and RGB values. Extensive experiments on two standard and recent datasets, Replica and ScanNet, show that ESLAM improves the 3D reconstruction and camera localization accuracy of state-of-the-art dense visual SLAM methods by more than 50%, while running up to 10× faster and requiring no pre-training.
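The scene representation described in the abstract can be illustrated with a minimal NumPy sketch: features are bilinearly interpolated from three axis-aligned planes at each scale and passed to shallow decoders that output a TSDF value and an RGB color. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say how features are combined, so summing across the three planes, concatenating across scales, using separate geometry and appearance planes, the plane resolutions, the layer sizes, and the tanh/sigmoid output squashing are all choices made for this sketch. All function names are hypothetical, and the weights are random rather than optimized.

```python
import numpy as np


def bilinear_sample(plane, uv):
    """Bilinearly interpolate an (H, W, C) feature plane at uv in [0, 1]^2."""
    h, w, _ = plane.shape
    # Map normalized coordinates to continuous pixel indices.
    x = np.clip(uv[:, 0], 0.0, 1.0) * (w - 1)
    y = np.clip(uv[:, 1], 0.0, 1.0) * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = plane[y0, x0] * (1 - wx) + plane[y0, x1] * wx
    bottom = plane[y1, x0] * (1 - wx) + plane[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def sample_planes(scales, points):
    """Assumption: sum the xy/xz/yz features per scale, concatenate scales."""
    feats = []
    for planes in scales:
        feats.append(
            bilinear_sample(planes["xy"], points[:, [0, 1]])
            + bilinear_sample(planes["xz"], points[:, [0, 2]])
            + bilinear_sample(planes["yz"], points[:, [1, 2]])
        )
    return np.concatenate(feats, axis=1)


def mlp(features, w1, b1, w2, b2):
    """Shallow decoder: one hidden ReLU layer followed by a linear layer."""
    hidden = np.maximum(features @ w1 + b1, 0.0)
    return hidden @ w2 + b2


def decode(points, geo_scales, app_scales, geo_mlp, app_mlp):
    """Return (tsdf, rgb) for points given in the unit cube [0, 1]^3."""
    geo_feat = sample_planes(geo_scales, points)
    app_feat = sample_planes(app_scales, points)
    # Assumption: tanh keeps the TSDF in [-1, 1]; sigmoid keeps RGB in [0, 1].
    tsdf = np.tanh(mlp(geo_feat, *geo_mlp))[:, 0]
    rgb = 1.0 / (1.0 + np.exp(-mlp(app_feat, *app_mlp)))
    return tsdf, rgb


def init_planes(rng, resolutions, channels):
    """One dict of three random feature planes per scale (coarse to fine)."""
    return [
        {k: rng.normal(0.0, 0.01, (r, r, channels)) for k in ("xy", "xz", "yz")}
        for r in resolutions
    ]


def init_mlp(rng, d_in, d_hidden, d_out):
    """Random weights and zero biases for the two-layer decoder."""
    return (
        rng.normal(0.0, 0.1, (d_in, d_hidden)), np.zeros(d_hidden),
        rng.normal(0.0, 0.1, (d_hidden, d_out)), np.zeros(d_out),
    )


rng = np.random.default_rng(0)
resolutions, channels = (16, 64), 8  # illustrative values, not from the paper
geo_scales = init_planes(rng, resolutions, channels)
app_scales = init_planes(rng, resolutions, channels)
geo_mlp = init_mlp(rng, channels * len(resolutions), 32, 1)
app_mlp = init_mlp(rng, channels * len(resolutions), 32, 3)

points = rng.uniform(0.0, 1.0, (5, 3))
tsdf, rgb = decode(points, geo_scales, app_scales, geo_mlp, app_mlp)
print(tsdf.shape, rgb.shape)
```

In a full system, the plane features and decoder weights would be optimized against the incoming RGB-D frames. Storing features on planes rather than in a dense voxel grid makes memory grow with the square of the scene side length instead of its cube.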

updated: Mon Nov 21 2022 18:25:14 GMT+0000 (UTC)

published: Mon Nov 21 2022 18:25:14 GMT+0000 (UTC)

arXiv
