DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars

David Svitov; Dmitrii Gudkov; Renat Bashirov; Victor Lempitsky

We present DINAR, an approach for creating realistic rigged full-body avatars from single RGB images. Similarly to previous works, our method uses neural textures combined with the SMPL-X body model to achieve photo-realistic avatar quality while keeping the avatars easy to animate and fast to infer. To restore the texture, we use a latent diffusion model and show how such a model can be trained in the neural texture space. The diffusion model allows us to realistically reconstruct large unseen regions, such as the back of a person given only the frontal view. The models in our pipeline are trained using 2D images and videos only. In the experiments, our approach achieves state-of-the-art rendering quality and good generalization to new poses and viewpoints. In particular, the approach improves the state of the art on the SnapshotPeople public benchmark.

updated: Thu Mar 16 2023 15:04:10 GMT+0000 (UTC)

published: Thu Mar 16 2023 15:04:10 GMT+0000 (UTC)

arXiv
