FPCC-Net: Fast Point Cloud Clustering for Instance Segmentation

Yajun Xu; Shogo Arai; Diyi Liu; Fangzhou Lin; Kazuhiro Kosuge

Instance segmentation is an important pre-processing task in numerous real-world applications, such as robotics, autonomous vehicles, and human-computer interaction. However, there has been little research on 3D point cloud instance segmentation of bin-picking scenes, in which multiple objects of the same class are stacked together. Compared with the rapid development of deep learning for two-dimensional (2D) image tasks, deep learning-based 3D point cloud segmentation still has considerable room for development. In bin-picking scenes, distinguishing a large number of occluded objects of the same class is a highly challenging problem. In a typical bin-picking scene, the object model is known and there is only one object type. Thus, semantic information can be ignored and the focus placed on segmenting instances. Based on this task requirement, we propose a network (FPCC-Net) that infers a feature center for each instance and then clusters the remaining points to the closest feature center in the feature embedding space. FPCC-Net includes two subnets: one infers the feature centers used for clustering, and the other describes the features of each point. The proposed method is compared with existing 3D point cloud and 2D segmentation methods on several bin-picking scenes. FPCC-Net is shown to improve average precision (AP) by about 40% compared with SGPN and to process about 60,000 points in about 0.8 s.
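The clustering step described in the abstract (assigning every point to its closest feature center in the embedding space) can be illustrated with a short NumPy sketch. This is not the authors' implementation: it assumes the per-point embeddings and the indices of the inferred center points are already available from the two subnets, and the names `assign_to_nearest_center`, `embeddings`, and `center_idx` are hypothetical. Any additional criteria the paper may apply when clustering are not covered here.

```python
import numpy as np


def assign_to_nearest_center(embeddings, center_idx):
    """Label each point with the index of its nearest feature center.

    embeddings: (N, D) array of per-point feature vectors (assumed given).
    center_idx: (K,) array of indices of points chosen as instance centers
                (assumed given; how they are inferred is not shown here).
    Returns an (N,) array of instance labels in the range [0, K).
    """
    centers = embeddings[center_idx]                      # (K, D)
    # Squared Euclidean distance from every point to every center: (N, K)
    diff = embeddings[:, None, :] - centers[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Each point joins the instance whose center is closest in feature space
    return np.argmin(dist2, axis=1)


# Toy example: five points in a 2-D embedding space, two assumed centers
embeddings = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [5.0, 5.0],
    [5.1, 4.9],
    [0.0, 0.2],
])
center_idx = np.array([0, 2])
labels = assign_to_nearest_center(embeddings, center_idx)
```

In the toy example, the points near the origin fall into the first instance and the points near (5, 5) into the second. For a scene with tens of thousands of points, the full (N, K, D) difference array may be large, so a chunked or GPU-based distance computation would be a reasonable alternative.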

updated: Tue Jan 19 2021 03:10:10 GMT+0000 (UTC)

published: Tue Dec 29 2020 05:58:35 GMT+0000 (UTC)

arXiv
