Two Heads are Better than One: Geometric-Latent Attention for Point Cloud Classification and Segmentation

Hanz Cuevas-Velasquez; Antonio Javier Gallego; Robert B. Fisher

We present an innovative two-headed attention layer that combines geometric and latent features to segment a 3D scene into semantically meaningful subsets. Each head combines local and global information from a neighborhood of points, using either the geometric or the latent features, and uses this information to learn better local relationships. This Geometric-Latent attention layer (Ge-Latto) is combined with a sub-sampling strategy to capture global features. Our method is invariant to permutation thanks to the use of shared-MLP layers, and it can be applied to point clouds of varying density because the local attention layer does not depend on the neighbor order. Our proposal is simple yet robust: it achieves competitive results on the ShapeNetPart and ModelNet40 datasets and state-of-the-art results on the complex S3DIS dataset, with 69.2% IoU on Area 5 and 89.7% overall accuracy using K-fold cross-validation on the 6 areas.
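The abstract does not give the layer's equations, so the following is only a rough sketch of the general idea of two attention heads over a point's neighborhood, one scoring neighbors by geometry and one by latent features. It is not the Ge-Latto layer itself. Every name here (`attention_head`, `two_head_layer`, `w_geo`, `w_lat`) is hypothetical, a single shared weight vector stands in for the shared MLP, and the choice of relative offsets as attention keys is an assumption. The final lines check the property the abstract relies on: softmax-weighted pooling does not depend on the order in which neighbors are listed.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_head(keys, values, w):
    # Score every neighbor with the same (shared) weight vector w,
    # then pool the neighbor values with the softmax weights.
    weights = softmax([dot(w, k) for k in keys])
    dim = len(values[0])
    return [sum(a * v[d] for a, v in zip(weights, values)) for d in range(dim)]


def two_head_layer(center_xyz, center_feat, nbr_xyz, nbr_feat, w_geo, w_lat):
    # Geometric head (assumption): keys are neighbor offsets from the center point.
    geo_keys = [[p - c for p, c in zip(xyz, center_xyz)] for xyz in nbr_xyz]
    # Latent head (assumption): keys are feature differences from the center feature.
    lat_keys = [[f - c for f, c in zip(feat, center_feat)] for feat in nbr_feat]
    geo_out = attention_head(geo_keys, nbr_feat, w_geo)
    lat_out = attention_head(lat_keys, nbr_feat, w_lat)
    # Concatenate the outputs of the two heads.
    return geo_out + lat_out


# Toy neighborhood: 6 neighbors, 3-D coordinates, 4-D latent features.
rng = random.Random(0)
k, feat_dim = 6, 4
center_xyz = [0.0, 0.0, 0.0]
center_feat = [rng.uniform(-1, 1) for _ in range(feat_dim)]
nbr_xyz = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(k)]
nbr_feat = [[rng.uniform(-1, 1) for _ in range(feat_dim)] for _ in range(k)]
w_geo = [rng.uniform(-1, 1) for _ in range(3)]
w_lat = [rng.uniform(-1, 1) for _ in range(feat_dim)]

out = two_head_layer(center_xyz, center_feat, nbr_xyz, nbr_feat, w_geo, w_lat)

# Present the same neighbors in a different order.
order = list(range(k))
rng.shuffle(order)
out_shuffled = two_head_layer(
    center_xyz,
    center_feat,
    [nbr_xyz[i] for i in order],
    [nbr_feat[i] for i in order],
    w_geo,
    w_lat,
)

# The two outputs agree up to floating-point rounding.
same = all(abs(a - b) < 1e-9 for a, b in zip(out, out_shuffled))
print(same)
```

Because each neighbor is scored independently with the same weights and the results are combined by a sum, reordering the neighbors only reorders the terms of that sum. This is the sense in which a local attention layer of this kind can be independent of neighbor order.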

updated: Sat Oct 30 2021 11:20:56 GMT+0000 (UTC)

published: Sat Oct 30 2021 11:20:56 GMT+0000 (UTC)

arXiv

