Full Transformer Framework for Robust Point Cloud Registration with Deep Information Interaction

Guangyan Chen; Meiling Wang; Yufeng Yue; Qingxiang Zhang; Li Yuan

Recent Transformer-based methods have achieved strong performance in point cloud registration by exploiting the Transformer's order invariance and its ability to model dependencies when aggregating information. However, they still suffer from indistinct feature extraction and from sensitivity to noise and outliers. The reasons are: (1) CNNs fail to model global relations because of their local receptive fields, so the extracted features are susceptible to noise; (2) the shallow-wide architecture of the Transformers and the lack of positional encoding lead to indistinct feature extraction because information interaction is inefficient; (3) the omission of geometrical compatibility leads to inaccurate classification of inliers and outliers. To address the above limitations, a novel full Transformer network for point cloud registration is proposed, named the Deep Interaction Transformer (DIT), which incorporates: (1) a Point Cloud Structure Extractor (PSE) to model global relations and retrieve structural information with Transformer encoders; (2) a deep-narrow Point Feature Transformer (PFT) to facilitate deep information interaction across the two point clouds with positional encoding, such that the Transformers can establish comprehensive associations and directly learn the relative positions between points; (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method to measure spatial consistency and estimate inlier confidence by designing a triangulated descriptor. Extensive experiments on clean, noisy, and partially overlapping point cloud registration demonstrate that our method outperforms state-of-the-art methods.
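The abstract does not spell out how spatial consistency is measured, so the following is only a minimal sketch of the general idea behind geometric compatibility, not the GMCCE method or its triangulated descriptor. It assumes that a rigid transform preserves pairwise distances, so two putative correspondences are compatible when the distance between their source points matches the distance between their target points. The function name `spatial_consistency_scores` and the tolerance `tol` are hypothetical and chosen for illustration.

```python
import math

def spatial_consistency_scores(src, dst, tol=0.05):
    """Score putative correspondences src[i] <-> dst[i] (3D tuples).

    Assumption (not from the paper): a correspondence's score is the
    fraction of the other correspondences whose pairwise distance is
    preserved within `tol`, as it would be under a rigid transform.
    """
    n = len(src)
    scores = []
    for i in range(n):
        compatible = 0
        for j in range(n):
            if i == j:
                continue
            d_src = math.dist(src[i], src[j])  # distance in the source cloud
            d_dst = math.dist(dst[i], dst[j])  # distance in the target cloud
            if abs(d_src - d_dst) <= tol:
                compatible += 1
        scores.append(compatible / (n - 1) if n > 1 else 0.0)
    return scores

# Three correspondences related by a pure translation, plus one outlier.
src = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
dst = [(1, 2, 3), (2, 2, 3), (1, 3, 3), (9, 9, 9)]
scores = spatial_consistency_scores(src, dst)
```

In this toy example the three translated correspondences agree with each other but not with the fourth, so the outlier receives the lowest score. A learned confidence estimator could use such scores as one input, but how DIT does this is not described in the abstract.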

updated: Fri Dec 17 2021 08:40:52 GMT+0000 (UTC)

published: Fri Dec 17 2021 08:40:52 GMT+0000 (UTC)

arXiv
