Text2Face: A Multi-Modal 3D Face Model

Will Rowan; Patrik Huber; Nick Pears; Andrew Keeling

We present the first 3D morphable modelling approach in which 3D face shape can be directly and completely defined using a textual prompt. Building on work in multi-modal learning, we extend the FLAME head model to a common image-and-text latent space. This allows direct 3D Morphable Model (3DMM) parameter generation, and therefore shape manipulation, from textual descriptions. Our method, Text2Face, has many applications, for example generating police photofits, where the input is already in natural language. It further enables multi-modal 3DMM image fitting to sketches and sculptures, as well as images.
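The abstract does not describe the architecture, so the following is only a toy sketch of the general idea of driving a linear 3DMM from a latent embedding. Everything in it is an assumption made for illustration: the random mean shape and basis stand in for a real head model such as FLAME, the random vector stands in for the output of a text encoder, and the helper names (`make_toy_3dmm`, `embedding_to_params`, `params_to_vertices`) and the linear mapping are hypothetical, not the authors' method.

```python
import numpy as np


def make_toy_3dmm(n_vertices, n_params, seed=0):
    """Build a random stand-in for a linear 3DMM (not the real FLAME model)."""
    rng = np.random.default_rng(seed)
    mean_shape = rng.normal(size=(n_vertices, 3))           # mean vertex positions
    basis = rng.normal(size=(n_vertices * 3, n_params))     # shape basis, one column per parameter
    return mean_shape, basis


def embedding_to_params(embedding, weights, bias):
    """Hypothetical linear map from a latent embedding to 3DMM shape parameters."""
    return weights @ embedding + bias


def params_to_vertices(params, mean_shape, basis):
    """Linear 3DMM reconstruction: vertices = mean shape + basis @ params."""
    offsets = (basis @ params).reshape(mean_shape.shape)
    return mean_shape + offsets


if __name__ == "__main__":
    n_vertices, n_params, embed_dim = 100, 10, 32
    mean_shape, basis = make_toy_3dmm(n_vertices, n_params)

    rng = np.random.default_rng(1)
    # Stand-in for a text embedding; a real system would use a text encoder here.
    text_embedding = rng.normal(size=embed_dim)
    # Untrained, random mapping weights (assumption: a learned mapping would replace these).
    weights = rng.normal(size=(n_params, embed_dim)) * 0.01
    bias = np.zeros(n_params)

    params = embedding_to_params(text_embedding, weights, bias)
    vertices = params_to_vertices(params, mean_shape, basis)
    print(vertices.shape)
```

In this sketch, a different embedding yields different shape parameters and therefore a different mesh, which is the sense in which a shared latent space would let a textual description control 3D face shape.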

updated: Wed Mar 08 2023 11:28:21 GMT+0000 (UTC)

published: Sun Mar 05 2023 15:06:54 GMT+0000 (UTC)

arXiv
