Face Hallucination with Finishing Touches

Yang Zhang; Ivor W. Tsang; Jun Li; Ping Liu; Xiaobo Lu; Xin Yu

Obtaining a high-quality frontal face image from a low-resolution (LR) non-frontal face image is of primary importance for many facial analysis applications. However, mainstream methods focus either on super-resolving near-frontal LR faces or on frontalizing non-frontal high-resolution (HR) faces. It is desirable to perform both tasks seamlessly for unconstrained face images from daily life. In this paper, we present a novel Vivid Face Hallucination Generative Adversarial Network (VividGAN) for simultaneously super-resolving and frontalizing tiny non-frontal face images. VividGAN consists of coarse-level and fine-level Face Hallucination Networks (FHnet) and two discriminators, i.e., Coarse-D and Fine-D. The coarse-level FHnet generates a frontal coarse HR face, and the fine-level FHnet then makes use of a facial component appearance prior, i.e., fine-grained facial components, to attain a frontal HR face image with authentic details. In the fine-level FHnet, we also design a facial component-aware module that uses facial geometry guidance as a cue to accurately align and merge the frontal coarse HR face with the prior information. Meanwhile, the two-level discriminators are designed to capture both the global outline of a face image and its detailed facial characteristics. Coarse-D forces the coarsely hallucinated faces to be upright and complete, while Fine-D focuses on the finely hallucinated ones for sharper details. Extensive experiments demonstrate that VividGAN produces photo-realistic frontal HR faces and reaches superior performance in downstream tasks, i.e., face recognition and expression classification, compared with other state-of-the-art methods.
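
The coarse-to-fine data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`coarse_stage`, `component_aware_merge`, `hallucinate`), the nearest-neighbour upsampling, and the mask-based blending are all assumptions that stand in for the learned networks, and frontalization is not modelled at all. The sketch only shows how a coarse HR estimate might be refined by merging a facial component prior at locations indicated by geometry guidance.

```python
from typing import List, Tuple

# Grayscale image as rows of pixel intensities (hypothetical toy representation).
Image = List[List[float]]


def coarse_stage(lr: Image, scale: int) -> Image:
    """Stand-in for the coarse-level FHnet (assumption).

    Uses nearest-neighbour upsampling instead of a learned network,
    and does not frontalize the face.
    """
    out: Image = []
    for row in lr:
        wide = [p for p in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def component_aware_merge(coarse: Image, prior: Image, mask: Image) -> Image:
    """Stand-in for the facial component-aware module (assumption).

    `mask` plays the role of facial geometry guidance: 1.0 where a facial
    component (eye, nose, mouth) lies, 0.0 elsewhere. The prior replaces
    the coarse pixels only inside the mask.
    """
    return [
        [m * q + (1.0 - m) * c for c, q, m in zip(c_row, q_row, m_row)]
        for c_row, q_row, m_row in zip(coarse, prior, mask)
    ]


def hallucinate(lr: Image, prior: Image, mask: Image, scale: int) -> Tuple[Image, Image]:
    """Run both stages and return (coarse, fine).

    In adversarial training, a Coarse-D would judge `coarse` and a
    Fine-D would judge `fine`; the discriminators are omitted here.
    """
    coarse = coarse_stage(lr, scale)
    fine = component_aware_merge(coarse, prior, mask)
    return coarse, fine


# Usage: a 1x2 LR "face", upsampled 2x, with a flat prior and two masked pixels.
lr = [[0.0, 1.0]]
prior = [[0.5] * 4 for _ in range(2)]
mask = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
coarse, fine = hallucinate(lr, prior, mask, scale=2)
```

Returning both intermediate and final images mirrors the two-level supervision in the abstract, where each output is assessed by its own discriminator.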

updated: Wed Oct 28 2020 03:10:44 GMT+0000 (UTC)

published: Sun Feb 09 2020 07:33:48 GMT+0000 (UTC)

arXiv
