PV3D: A 3D Generative Model for Portrait Video Generation

Eric Zhongcong Xu; Jianfeng Zhang; Jun Hao Liew; Wenqing Zhang; Song Bai; Jiashi Feng; Mike Zheng Shou

Recent advances in generative adversarial networks (GANs) have demonstrated the ability to generate strikingly photo-realistic portrait images. While some prior works have applied such image GANs to unconditional 2D portrait video generation and static 3D portrait synthesis, few have successfully extended GANs to generate 3D-aware portrait videos. In this work, we propose PV3D, the first generative framework that can synthesize multi-view consistent portrait videos. Specifically, our method extends a recent static 3D-aware image GAN to the video domain by generalizing the 3D implicit neural representation to model the spatio-temporal space. To introduce motion dynamics into the generation process, we develop a motion generator that stacks multiple motion layers to generate motion features via modulated convolution. To alleviate motion ambiguities caused by camera and human motions, we propose a simple yet effective camera condition strategy for PV3D, enabling video generation that is both temporally and multi-view consistent. Moreover, PV3D introduces two discriminators that regularize the spatial and temporal domains to ensure the plausibility of the generated portrait videos. These designs enable PV3D to generate 3D-aware, motion-plausible portrait videos with high-quality appearance and geometry, significantly outperforming prior works. As a result, PV3D supports many downstream applications, such as animating static portraits and view-consistent video motion editing. Code and models are released at https://showlab.github.io/pv3d.
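
The abstract says the motion generator produces motion features via modulated convolution. As a rough illustration of that operation in general, the NumPy sketch below follows the weight modulation and demodulation scheme popularized by StyleGAN2. It is not taken from the PV3D code release. The function name `modulated_conv2d`, the tensor shapes, and the idea that the `style` vector would be derived from a motion code are all assumptions made for illustration.

```python
import numpy as np

def modulated_conv2d(x, weight, style, demodulate=True, eps=1e-8):
    """Hypothetical StyleGAN2-style modulated convolution (stride 1, zero padding).

    x:      (C_in, H, W) input feature map
    weight: (C_out, C_in, K, K) shared convolution kernel, K odd
    style:  (C_in,) per-input-channel scales (assumed to come from a motion code)
    """
    c_out, c_in, k, _ = weight.shape
    # Modulate: scale each input channel of the kernel by its style value.
    w = weight * style[None, :, None, None]
    if demodulate:
        # Demodulate: rescale every output filter to (approximately) unit L2 norm.
        norm = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + eps)
        w = w / norm
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    # Cross-correlation, accumulated one kernel offset at a time.
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + wd]  # (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out
```

Because the filters are renormalized after modulation, multiplying `style` by a positive constant leaves the output essentially unchanged. Only the relative emphasis across input channels matters, which is one reason this form of conditioning tends to be stable in GAN generators.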

updated: Wed Feb 01 2023 02:57:14 GMT+0000 (UTC)

published: Tue Dec 13 2022 05:42:44 GMT+0000 (UTC)

arXiv
