Recovery of Superquadrics from Range Images using Deep Learning: A Preliminary Study

Tim Oblak; Klemen Grm; Aleš Jaklič; Peter Peer; Vitomir Štruc; Franc Solina

It has been a longstanding goal in computer vision to describe 3D physical space in terms of parameterized volumetric models that would allow autonomous machines to understand and interact with their surroundings. Such models are typically motivated by human visual perception and aim to represent all elements of the physical world, ranging from individual objects to complex scenes, using a small set of parameters. One of the de facto standards for approaching this problem is the superquadric, a volumetric model that defines various 3D shape primitives and can be fitted to actual 3D data (either in the form of point clouds or range images). However, existing solutions to superquadric recovery involve costly iterative fitting procedures, which limit the applicability of such techniques in practice. To alleviate this problem, we explore in this paper the possibility of recovering superquadrics from range images without time-consuming iterative parameter estimation techniques by using contemporary deep-learning models, more specifically, convolutional neural networks (CNNs). We pose the superquadric recovery problem as a regression task and develop a CNN regressor that is able to estimate the parameters of a superquadric model from a given range image. We train the regressor on a large set of synthetic range images, each containing a single (unrotated) superquadric shape, and evaluate the learned model in comparative experiments with the current state of the art. Additionally, we present a qualitative analysis involving a dataset of real-world objects. The results of our experiments show that the proposed regressor not only outperforms the existing state of the art, but also ensures a 270x faster execution time.
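
The abstract does not spell out which parameters a superquadric model has. As a minimal sketch, assuming the standard inside-outside formulation from the general superquadric literature (not taken from this page), an unrotated superquadric centred at the origin can be described by three size parameters and two shape exponents. The function names and the parameter names `a1`, `a2`, `a3`, `eps1`, `eps2` below are illustrative assumptions, and the sketch says nothing about the CNN regressor itself.

```python
import math


def inside_outside(x, y, z, a1, a2, a3, eps1, eps2):
    """Standard superquadric inside-outside function F (assumed formulation).

    For an unrotated superquadric centred at the origin:
    F < 1 means the point is inside, F == 1 on the surface, F > 1 outside.
    """
    # Absolute values keep fractional powers real for negative coordinates.
    xy_term = (abs(x / a1) ** (2.0 / eps2)
               + abs(y / a2) ** (2.0 / eps2)) ** (eps2 / eps1)
    z_term = abs(z / a3) ** (2.0 / eps1)
    return xy_term + z_term


def signed_pow(value, exponent):
    """Raise |value| to a power while keeping the sign of value."""
    return math.copysign(abs(value) ** exponent, value)


def surface_point(eta, omega, a1, a2, a3, eps1, eps2):
    """Point on the superquadric surface for angles eta and omega (radians)."""
    x = a1 * signed_pow(math.cos(eta), eps1) * signed_pow(math.cos(omega), eps2)
    y = a2 * signed_pow(math.cos(eta), eps1) * signed_pow(math.sin(omega), eps2)
    z = a3 * signed_pow(math.sin(eta), eps1)
    return x, y, z


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    params = (2.0, 3.0, 4.0, 0.5, 1.5)
    point = surface_point(0.3, 1.1, *params)
    print(point, inside_outside(*point, *params))
```

Under this formulation, iterative fitting methods search for the parameters that bring F close to 1 at the observed 3D points, whereas a regressor of the kind described above would predict the parameters directly from the range image.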

updated: Tue Jul 28 2020 15:22:17 GMT+0000 (UTC)

published: Sat Apr 13 2019 19:01:11 GMT+0000 (UTC)

arXiv
