Learning Full-Head 3D GANs from a Single-View Portrait Dataset

Yiqian Wu; Hao Xu; Xiangjun Tang; Hongbo Fu; Xiaogang Jin

3D-aware face generators are commonly trained on 2D real-life face image datasets. Nevertheless, existing facial recognition methods often struggle to extract face data captured from various camera angles. Furthermore, in-the-wild images with diverse body poses introduce a high-dimensional challenge for 3D-aware generators, making it difficult to utilize data that contains complete neck and shoulder regions. Consequently, these face image datasets often contain only near-frontal face data, which makes it hard for 3D-aware face generators to construct full-head 3D portraits. To this end, we first create the dataset 360°-Portrait-HQ (360°PHQ), which consists of high-quality single-view real portraits annotated with a variety of camera parameters (the yaw angles span the entire 360° range) and body poses. We then propose 3DPortraitGAN, the first 3D-aware full-head portrait generator that learns a canonical 3D avatar distribution from the 360°PHQ dataset, whose body poses vary, using body pose self-learning. Our model can generate view-consistent portrait images from all camera angles (360°) with a full-head 3D representation. We incorporate a mesh-guided deformation field into volumetric rendering so that our canonical generator can produce deformed results, yielding portrait images that conform to the body pose distribution of the dataset. We integrate two pose predictors into our framework to predict more accurate body poses, addressing the issue of inaccurately estimated body poses in our dataset. Our experiments show that the proposed framework can generate view-consistent, realistic portrait images with complete geometry from all camera angles and accurately predict portrait body pose.

updated: Thu Jul 27 2023 11:02:36 GMT+0000 (UTC)

published: Thu Jul 27 2023 11:02:36 GMT+0000 (UTC)

arXiv
