Spatially Invariant Unsupervised 3D Object Segmentation with Graph Neural Networks

Tianyu Wang; Miaomiao Liu; Kee Siong Ng



In this paper, we tackle the problem of unsupervised 3D object segmentation from a point cloud without RGB information. In particular, we propose a framework, SPAIR3D, that models a point cloud as a spatial mixture model and jointly learns multiple-object representation and segmentation in 3D via Variational Autoencoders (VAEs). Inspired by SPAIR, we adopt an object-specification scheme that describes each object's location relative to its local voxel grid cell rather than to the point cloud as a whole. To apply the spatial mixture model to point clouds, we derive the Chamfer Likelihood, which fits naturally into the variational training pipeline. We further design a new spatially invariant graph neural network that serves as the decoder within our VAE and generates a varying number of 3D points. Experimental results demonstrate that SPAIR3D can detect and segment a variable number of objects without appearance information across diverse scenes.
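Two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the uniform `cell_size`, and the normalised offset are assumptions made for illustration. The second function is the standard symmetric Chamfer distance between point sets, shown here only as the familiar quantity that the name "Chamfer Likelihood" suggests; the paper's actual likelihood is not given in the abstract and is not reproduced here.

```python
import math


def cell_relative_position(center, cell_size):
    """Split a 3D position into a voxel cell index and an offset inside that cell.

    Hypothetical helper: the offset is normalised to [0, 1) per axis, so it
    does not depend on where the cell sits in the scene.
    """
    index = tuple(math.floor(c / cell_size) for c in center)
    offset = tuple(c / cell_size - i for c, i in zip(center, index))
    return index, offset


def absolute_position(index, offset, cell_size):
    """Invert cell_relative_position: recover the scene-level coordinates."""
    return tuple((i + o) * cell_size for i, o in zip(index, offset))


def chamfer_distance(points_a, points_b):
    """Standard symmetric Chamfer distance (mean squared nearest-neighbour distance).

    The two point sets may contain different numbers of points.
    """
    def one_sided(src, dst):
        total = 0.0
        for s in src:
            # Squared Euclidean distance from s to its nearest point in dst.
            total += min(sum((p - q) ** 2 for p, q in zip(s, d)) for d in dst)
        return total / len(src)

    return one_sided(points_a, points_b) + one_sided(points_b, points_a)
```

Shifting an object by exactly one cell changes only the cell index, not the offset, which is the sense in which a cell-relative description is independent of the object's absolute position in the scene. The Chamfer distance accepts point sets of different sizes, which matters when a decoder emits a varying number of points.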

updated: Fri Jun 11 2021 12:07:16 GMT+0000 (UTC)

published: Thu Jun 10 2021 09:20:16 GMT+0000 (UTC)

arXiv
