Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images

Ming-Jia Yang; Yu-Xiao Guo; Bin Zhou; Xin Tong

We present a method for creating 3D indoor scenes with a generative model learned from a collection of semantic-segmented depth images captured from different unknown scenes. Given a room of a specified size, our method automatically generates 3D objects in the room from a randomly sampled latent code. Existing methods represent an indoor scene by the type, location, and other properties of the objects in the room, and learn the scene layout from a collection of complete 3D indoor scenes. In contrast, our method models each indoor scene as a 3D semantic scene volume and learns a volumetric generative adversarial network (GAN) from a collection of 2.5D partial observations of 3D scenes. To this end, we apply a differentiable projection layer to project the generated 3D semantic scene volumes into semantic-segmented depth images, and we design a new multiple-view discriminator for learning the complete 3D scene volume from 2.5D semantic-segmented depth images. Compared to existing methods, our method not only reduces the workload of modeling and acquiring 3D scenes for training, but also produces better object shapes and detailed layouts in the scene. We evaluate our method on different indoor scene datasets and demonstrate its advantages. We also extend our method to generate 3D indoor scenes from semantic-segmented depth images inferred from RGB images of real scenes.
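The projection step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's layer: it assumes an orthographic view along the first volume axis, assumes that class 0 means "empty", and uses a generic soft ray-termination formulation so that every output is a smooth function of the voxel probabilities. The function name `project_semantic_volume` and the array layout are hypothetical.

```python
import numpy as np

def project_semantic_volume(volume):
    """Project a semantic volume to a 2.5D depth + label image (sketch).

    volume: (D, H, W, C) per-voxel class probabilities; class 0 is assumed
            to mean "empty". Rays travel along axis 0 (orthographic view).
    Returns (depth, semantics) with shapes (H, W) and (H, W, C).
    """
    depth_bins = volume.shape[0]
    occupancy = 1.0 - volume[..., 0]                    # (D, H, W)

    # Probability that a ray has passed through slices 0..d without stopping.
    transmittance = np.cumprod(1.0 - occupancy, axis=0)
    # Probability that the ray reaches slice d at all.
    reach = np.concatenate(
        [np.ones_like(occupancy[:1]), transmittance[:-1]], axis=0
    )
    stop = reach * occupancy                            # ray stops in slice d
    escape = transmittance[-1]                          # ray leaves the volume

    # Expected depth in slice units; escaping rays get the far-plane depth.
    slice_index = np.arange(depth_bins, dtype=float).reshape(-1, 1, 1)
    depth = (stop * slice_index).sum(axis=0) + escape * depth_bins

    # Expected label of the first surface hit; channel 0 holds "no hit".
    object_classes = (reach[..., None] * volume[..., 1:]).sum(axis=0)
    semantics = np.concatenate([escape[..., None], object_classes], axis=-1)
    return depth, semantics
```

With a hard one-hot volume this reduces to ordinary first-hit projection: a pixel whose ray meets an object voxel in slice 2 gets depth 2 and that voxel's label, and a pixel whose ray meets nothing gets the far-plane depth and the "empty" label. In a learning setup the same arithmetic would be written in an autodiff framework so that gradients from an image-space discriminator can reach the generated volume.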

updated: Fri Aug 20 2021 06:22:49 GMT+0000 (UTC)

published: Fri Aug 20 2021 06:22:49 GMT+0000 (UTC)

arXiv
