Pro-UIGAN: Progressive Face Hallucination from Occluded Thumbnails

Yang Zhang; Xin Yu; Xiaobo Lu; Ping Liu

In this paper, we study the task of hallucinating an authentic high-resolution (HR) face from an occluded thumbnail. We propose a multi-stage Progressive Upsampling and Inpainting Generative Adversarial Network, dubbed Pro-UIGAN, which exploits facial geometry priors to inpaint and upsample (8×) tiny, occluded faces (16×16 pixels). Pro-UIGAN iteratively (1) estimates facial geometry priors for low-resolution (LR) faces and (2) recovers non-occluded HR face images under the guidance of the estimated priors. Our multi-stage hallucination network super-resolves and inpaints occluded LR faces in a coarse-to-fine manner, significantly reducing unwanted blurriness and artifacts. Specifically, we design a novel cross-modal transformer module for facial prior estimation, in which the input face features and the landmark features are formulated as queries and keys, respectively. This design encourages joint feature learning across the facial and landmark features, with deep feature correspondences discovered by attention. As a result, facial appearance features and facial geometry priors are learned in a mutually reinforcing manner. Extensive experiments demonstrate that Pro-UIGAN produces visually pleasing HR faces and outperforms other state-of-the-art (SotA) methods in downstream tasks, namely face alignment, face parsing, face recognition, and expression classification.
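
The query/key formulation described in the abstract can be illustrated with a minimal sketch of scaled dot-product cross-attention. This is not the authors' module: the abstract specifies only that face features act as queries and landmark features as keys, so reusing the landmark features as values, the toy two-dimensional tokens, and the names `cross_attention`, `face_tokens`, and `landmark_tokens` are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention in which every query attends over all keys.

    Returns the attended outputs and the attention weights.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    weights = []
    for q in queries:
        # Similarity between this query and each key, scaled by sqrt(dim).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights.append(softmax(scores))
    dim_v = len(values[0])
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim_v)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical toy features: 3 face tokens (queries), 2 landmark tokens (keys).
face_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
landmark_tokens = [[1.0, 0.0], [0.0, 1.0]]

# Assumption: landmark features are reused as values; the abstract does not say.
attended, weights = cross_attention(face_tokens, landmark_tokens, landmark_tokens)
```

Each row of `weights` is a distribution over the landmark tokens, so a face token that aligns with a particular landmark places more weight on it. This is the sense in which attention can surface correspondences between appearance and geometry features.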

updated: Sun Aug 08 2021 08:34:07 GMT+0000 (UTC)

published: Mon Aug 02 2021 02:29:24 GMT+0000 (UTC)

arXiv
