SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation

Jingwen Wang; Juan Tarrio; Lourdes Agapito; Pablo F. Alcantarilla; Alexander Vakhitov

The availability of real-time semantics greatly improves the core geometric functionality of SLAM systems, enabling numerous robotic and AR/VR applications. We present a new methodology for real-time semantic mapping from RGB-D sequences that combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. When segmenting a new frame, we perform latent feature re-projection from previous frames based on differentiable rendering. Fusing re-projected feature maps from previous frames with current-frame features greatly improves image segmentation quality compared to a baseline that processes images independently. For 3D map processing, we propose a novel geometric quasi-planar over-segmentation method that relies on surface normals to group 3D map elements likely to belong to the same semantic class. We also describe a novel neural network design for lightweight semantic map post-processing. Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems and matches the performance of 3D convolutional networks on three real indoor datasets, while running in real time. Moreover, it shows better cross-sensor generalization than 3D CNNs, enabling training and inference with different depth sensors. Code and data will be released on the project page: http://jingwenwang95.github.io/SeMLaPS
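
The abstract does not spell out how the quasi-planar over-segmentation works beyond its reliance on surface normals. As a rough illustration of the general idea only, the sketch below groups neighbouring map elements whose normals stay close to a seed normal, using plain region growing. The function names, the adjacency-list input, and the 15-degree threshold are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import deque


def angle_between(n1, n2):
    """Return the angle in degrees between two unit normals."""
    dot = sum(a * b for a, b in zip(n1, n2))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def quasi_planar_segments(normals, neighbors, max_angle_deg=15.0):
    """Hypothetical normal-based over-segmentation by region growing.

    normals:   list of unit normals (3-tuples), one per map element.
    neighbors: neighbors[i] lists the indices adjacent to element i.
    Returns one integer segment label per element.
    """
    labels = [-1] * len(normals)
    next_label = 0
    for seed in range(len(normals)):
        if labels[seed] != -1:
            continue  # already assigned to an earlier segment
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j in neighbors[i]:
                # Compare against the seed normal (assumed design choice)
                # so a segment cannot drift around a curved surface.
                if labels[j] == -1 and angle_between(
                    normals[seed], normals[j]
                ) <= max_angle_deg:
                    labels[j] = next_label
                    queue.append(j)
        next_label += 1
    return labels


# Toy chain of four elements: two on a floor, two on a wall.
toy_normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_neighbors = [[1], [0, 2], [1, 3], [2]]
print(quasi_planar_segments(toy_normals, toy_neighbors))  # [0, 0, 1, 1]
```

In the toy input the floor and wall elements are adjacent but their normals differ by 90 degrees, so they end up in separate segments. A real system would derive the normals and adjacency from its 3D map rather than from hand-written lists.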

updated: Wed Jun 28 2023 22:36:44 GMT+0000 (UTC)

published: Wed Jun 28 2023 22:36:44 GMT+0000 (UTC)

arXiv
