HQ3DAvatar: High Quality Controllable 3D Head Avatar

Kartik Teotia; Mallikarjun B R; Xingang Pan; Hyeongwoo Kim; Pablo Garrido; Mohamed Elgharib; Christian Theobalt

Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capturing full-head dynamic performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This paper presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, which enables high-quality results, faster training, and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical-flow-based loss that ensures correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We present results on challenging facial expressions and demonstrate free-viewpoint rendering at interactive, real-time rates for medium image resolutions. Our method outperforms all existing approaches, both visually and numerically. We will release our multiple-identity dataset to encourage further research. Our project page is available at https://vcai.mpi-inf.mpg.de/projects/HQ3DAvatar/
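The abstract names multiresolution hash encoding without describing it. As a rough illustration only, the sketch below shows the generic technique as introduced for Instant-NGP: a 3D point is looked up in several hashed feature grids of increasing resolution, the eight surrounding grid corners are trilinearly interpolated at each level, and the per-level features are concatenated. This is not the authors' implementation. The function names, table sizes, level count, and random table values are all assumptions made for the example; in a real system the tables are learned, and the abstract does not specify how the encoding is wired into the canonical space.

```python
import itertools
import math
import random

# Per-axis hashing primes as used by Instant-NGP's spatial hash.
PRIMES = (1, 2654435761, 805459861)


def hash_corner(corner, table_size):
    """Spatial hash: XOR the prime-scaled integer coordinates, then wrap."""
    h = 0
    for coord, prime in zip(corner, PRIMES):
        h ^= coord * prime
    return h % table_size


def make_tables(num_levels=4, table_size=2 ** 10, feat_dim=2, seed=0):
    """Hypothetical stand-in for learnable tables: small random features."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1e-4, 1e-4) for _ in range(feat_dim)]
         for _ in range(table_size)]
        for _ in range(num_levels)
    ]


def encode(point, tables, base_res=4, growth=2.0):
    """Encode a 3D point in [0, 1]^3 as concatenated per-level features."""
    features = []
    for level, table in enumerate(tables):
        # Grid resolution grows geometrically with the level index.
        res = int(math.floor(base_res * growth ** level))
        scaled = [p * res for p in point]
        lower = [int(math.floor(s)) for s in scaled]
        frac = [s - lo for s, lo in zip(scaled, lower)]
        level_feat = [0.0] * len(table[0])
        # Trilinear interpolation over the 8 corners of the enclosing voxel.
        for offset in itertools.product((0, 1), repeat=3):
            corner = [lo + o for lo, o in zip(lower, offset)]
            weight = 1.0
            for f, o in zip(frac, offset):
                weight *= f if o else 1.0 - f
            entry = table[hash_corner(corner, len(table))]
            for i, value in enumerate(entry):
                level_feat[i] += weight * value
        features.extend(level_feat)
    return features


if __name__ == "__main__":
    tables = make_tables()
    print(len(encode((0.3, 0.6, 0.9), tables)))
```

With the default settings above, each point maps to a vector of 4 levels times 2 features. In a neural rendering pipeline such a vector would typically be fed to a small MLP, which is what makes training and rendering fast compared with a large coordinate network.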

updated: Sat Mar 25 2023 13:56:33 GMT+0000 (UTC)

published: Sat Mar 25 2023 13:56:33 GMT+0000 (UTC)

arXiv
