Face sketch to photo translation using generative adversarial networks

Nastaran Moradzadeh Farid; Maryam Saeedi Fard; Ahmad Nickabadi

Translating face sketches into photo-realistic faces is an interesting and essential task in many applications, such as law enforcement and the digital entertainment industry. One of the most important challenges of this task is the inherent difference between a sketch and a real image: a sketch lacks color and the fine details of the skin. With the advent of generative adversarial models, an increasing number of methods have been proposed for sketch-to-image synthesis. However, these models still suffer from limitations such as the large amount of paired data required for training, the low resolution of the produced images, or the unrealistic appearance of the generated images. In this paper, we propose a method for converting an input facial sketch to a color photo without the need for any paired dataset. To do so, we use a pre-trained face photo generating model to synthesize high-quality natural face photos and employ an optimization procedure to preserve fidelity to the input sketch. We train a network to map the facial features extracted from the input sketch to a vector in the latent space of the face generating model. We also study different optimization criteria and compare the results of the proposed model with those of state-of-the-art models, both quantitatively and qualitatively. The proposed model achieves an SSIM index of 0.655 and a rank-1 face recognition rate of 97.59%, with higher quality of the produced images.
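
The general idea of refining a latent vector so that the generated photo stays faithful to an input sketch can be illustrated with a toy example. The sketch below is not the authors' implementation: the "generator" is a fixed random linear map, the "sketch extractor" simply averages the three color channels, and the loss is a plain squared error. All names (`generate`, `to_sketch`, `refine_latent`) and all sizes are hypothetical, chosen only to keep the example self-contained. In the paper, the starting latent vector would come from the trained mapping network; here it is assumed to be all zeros.

```python
import random

random.seed(0)
LATENT_DIM, N_PIXELS = 4, 8  # toy sizes, assumed for illustration only

# Hypothetical stand-in for a frozen, pre-trained generator:
# a fixed linear map from a latent vector to a flattened RGB image.
W = [[random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
     for _ in range(N_PIXELS * 3)]


def generate(z):
    """Map a latent vector to a flattened RGB 'photo' (3 values per pixel)."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]


def to_sketch(photo):
    """Hypothetical sketch extractor: average the three channels per pixel."""
    return [sum(photo[3 * p:3 * p + 3]) / 3.0 for p in range(N_PIXELS)]


def sketch_loss(z, target_sketch):
    """Squared error between the sketch of the generated photo and the target."""
    produced = to_sketch(generate(z))
    return sum((a - b) ** 2 for a, b in zip(produced, target_sketch))


# Because both toy maps are linear, their composition is a matrix M,
# and the gradient of the loss has the closed form 2 * M^T (M z - s).
M = [[sum(W[3 * p + c][j] for c in range(3)) / 3.0 for j in range(LATENT_DIM)]
     for p in range(N_PIXELS)]


def refine_latent(z_init, target_sketch, steps=5000):
    """Gradient descent on the latent vector; the generator stays fixed."""
    # Conservative step size: the squared Frobenius norm of M bounds its
    # squared spectral norm, so each step cannot increase the loss.
    step_size = 1.0 / (2.0 * sum(m * m for row in M for m in row))
    z = list(z_init)
    for _ in range(steps):
        residual = [sum(m * zj for m, zj in zip(row, z)) - s
                    for row, s in zip(M, target_sketch)]
        grad = [2.0 * sum(M[p][j] * residual[p] for p in range(N_PIXELS))
                for j in range(LATENT_DIM)]
        z = [zj - step_size * g for zj, g in zip(z, grad)]
    return z


# Build a target sketch from a known latent vector, then try to recover it.
z_true = [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
target = to_sketch(generate(z_true))
z_init = [0.0] * LATENT_DIM  # assumed starting point (stand-in for a mapper output)
z_opt = refine_latent(z_init, target)
initial_loss = sketch_loss(z_init, target)
final_loss = sketch_loss(z_opt, target)
```

In a realistic setting the generator and the sketch extractor are nonlinear networks, so the gradient would be obtained by automatic differentiation rather than a closed form, and the squared error would likely be replaced by the optimization criteria the paper studies.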

updated: Sat Oct 23 2021 20:01:20 GMT+0000 (UTC)

published: Sat Oct 23 2021 20:01:20 GMT+0000 (UTC)

arXiv
