Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation

Ziming Wang; Xiaoliang Huo; Zhenghao Chen; Jing Zhang; Lu Sheng; Dong Xu

Point cloud registration aims to estimate the geometric transformation between two point cloud scans, and point-wise correspondence estimation is the key to its success. In addition to previous methods that seek correspondences using hand-crafted or learnt geometric features, recent point cloud registration methods have tried to apply RGB-D data to achieve more accurate correspondences. However, it is not trivial to effectively fuse the geometric and visual information from these two distinct modalities, especially for the registration problem. In this work, we propose a new Geometry-Aware Visual Feature Extractor (GAVE) that employs multi-scale local linear transformation to progressively fuse these two modalities, where the geometric features from the depth data act as geometry-dependent convolution kernels that transform the visual features from the RGB data. The resultant visual-geometric features lie in canonical feature spaces in which the visual dissimilarity caused by geometric changes is alleviated, so that more reliable correspondences can be achieved. The proposed GAVE module can be readily plugged into recent RGB-D point cloud registration frameworks. Extensive experiments on 3DMatch and ScanNet demonstrate that our method outperforms state-of-the-art point cloud registration methods even without correspondence or pose supervision. The code is available at: https://github.com/514DNA/LLT.
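To make the idea of a local linear transformation concrete, the following is a minimal single-scale sketch, not the authors' implementation. It assumes that each pixel's visual feature x is mapped to y = a(g) * x + b(g), where the per-pixel scale a and shift b are themselves linear functions of that pixel's geometric feature g. The function names, the weight matrices, and the toy values are all hypothetical; the abstract does not specify how GAVE generates its kernels or how the scales are combined.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Plain affine map: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def local_linear_transform(visual: List[Vector], geometric: List[Vector],
                           w_scale: Matrix, b_scale: Vector,
                           w_shift: Matrix, b_shift: Vector) -> List[Vector]:
    """Transform each pixel's visual feature with geometry-dependent parameters.

    Assumption (hypothetical): scale a and shift b come from two affine maps
    of the geometric feature, and are applied channel-wise: y = a * x + b.
    """
    fused = []
    for x, g in zip(visual, geometric):
        a = linear(w_scale, b_scale, g)   # per-pixel scale, one value per channel
        b = linear(w_shift, b_shift, g)   # per-pixel shift, one value per channel
        fused.append([ai * xi + bi for ai, xi, bi in zip(a, x, b)])
    return fused


# Toy input: two pixels with identical appearance but different geometry.
visual = [[1.0, 2.0], [1.0, 2.0]]   # 2 pixels, 2 visual channels
geometric = [[0.0], [1.0]]          # 2 pixels, 1 geometric channel

fused = local_linear_transform(
    visual, geometric,
    w_scale=[[1.0], [0.5]], b_scale=[1.0, 1.0],
    w_shift=[[0.0], [-1.0]], b_shift=[0.0, 0.0],
)
print(fused)  # [[1.0, 2.0], [2.0, 2.0]]
```

In this toy case the two pixels start with the same visual feature and end with different fused features, because the transformation applied to each one depends on its local geometry. A multi-scale variant would repeat this step at several feature resolutions, which is omitted here.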

updated: Wed Aug 31 2022 14:36:09 GMT+0000 (UTC)

published: Wed Aug 31 2022 14:36:09 GMT+0000 (UTC)

arXiv
