3D-Aware Semantic-Guided Generative Model for Human Synthesis

Jichao Zhang; Enver Sangineto; Hao Tang; Aliaksandr Siarohin; Zhun Zhong; Nicu Sebe; Wei Wang

Generative Neural Radiance Field (GNeRF) models, which extract implicit 3D representations from 2D images, have recently been shown to produce realistic images of rigid objects, such as human faces or cars. However, they usually struggle to generate high-quality images of non-rigid objects, such as the human body, which is of great interest for many computer graphics applications. This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SGAN) for human image synthesis, which integrates a GNeRF and a texture generator. The former learns an implicit 3D representation of the human body and outputs a set of 2D semantic segmentation masks. The latter transforms these semantic masks into a real image, adding a realistic texture to the human appearance. Without requiring additional 3D information, our model can learn 3D human representations with photo-realistic, controllable generation. Our experiments on the DeepFashion dataset show that 3D-SGAN significantly outperforms the most recent baselines.
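
The two-stage data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: neither stage below is a neural network or a radiance field, and the names `semantic_generator`, `texture_generator`, `NUM_CLASSES`, the four-class label set, and the `shape_code`, `camera_yaw`, and `texture_seed` parameters are all hypothetical, introduced only to show how a pose-dependent semantic mask might be produced first and then textured independently.

```python
import math
import random

# Hypothetical label set (an assumption, not from the paper):
# 0 = background, 1 = head, 2 = upper body, 3 = legs
NUM_CLASSES = 4
BACKGROUND_RGB = (255, 255, 255)


def semantic_generator(shape_code, camera_yaw, size=16):
    """Stand-in for the GNeRF stage: returns a size x size grid of class labels.

    A real model would render masks from a learned implicit 3D representation;
    here the silhouette is a simple column whose width depends on a shape code
    in [0, 1] and narrows as the camera yaws towards a side view.
    """
    half_width = size * (0.15 + 0.1 * shape_code)
    half_width *= 0.6 + 0.4 * abs(math.cos(camera_yaw))
    mask = []
    for y in range(size):
        row = []
        for x in range(size):
            inside = abs(x + 0.5 - size / 2) <= half_width
            if not inside:
                row.append(0)          # background
            elif y < size // 4:
                row.append(1)          # head
            elif y < (5 * size) // 8:
                row.append(2)          # upper body
            else:
                row.append(3)          # legs
        mask.append(row)
    return mask


def texture_generator(mask, texture_seed):
    """Stand-in for the texture stage: maps each label to an RGB colour.

    A real texture generator would synthesise detailed appearance; this toy
    version only draws one random colour per semantic class from the seed.
    """
    rng = random.Random(texture_seed)
    palette = {0: BACKGROUND_RGB}
    for label in range(1, NUM_CLASSES):
        palette[label] = tuple(rng.randrange(256) for _ in range(3))
    return [[palette[label] for label in row] for row in mask]


if __name__ == "__main__":
    # Same shape and pose, two different textures.
    mask = semantic_generator(shape_code=0.5, camera_yaw=0.0)
    image_a = texture_generator(mask, texture_seed=1)
    image_b = texture_generator(mask, texture_seed=2)
    for row in mask:
        print("".join(str(label) for label in row))
    print("textures differ:", image_a != image_b)
```

In this sketch, separating the mask stage from the texture stage is what makes the output controllable: pose and shape can be varied while the texture seed is held fixed, and vice versa.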

updated: Thu Dec 02 2021 17:10:53 GMT+0000 (UTC)

published: Thu Dec 02 2021 17:10:53 GMT+0000 (UTC)

arXiv
