Iterative Facial Image Inpainting Based on an Encoder-Generator Architecture

Yahya Dogan; Hacer Yalim Keles

Facial image inpainting is a challenging problem because it requires generating new pixels that carry semantic information for masked key components of a face, e.g., the eyes and nose. Several notable methods have recently been proposed in this field. Most of these approaches use encoder-decoder architectures and have various limitations, such as producing only a single result for a given image and mask. Alternatively, some optimization-based approaches generate promising results for different masks using generator networks; however, these approaches are computationally more expensive. In this paper, we propose an efficient solution to the facial image inpainting problem using the Cyclic Reverse Generator (CRG) architecture, which provides an encoder-generator model. We use the encoder to embed a given image into the generator space and incrementally inpaint the masked regions until a plausible image is generated. To assess the quality of the generated images during the iterations and to determine convergence, we trained a discriminator model. After the generation process, as post-processing, we use a Unet model that we trained specifically for this task to remedy artifacts close to the mask boundaries. We empirically observed that only a few iterations are sufficient to generate realistic images with the proposed model. Since the models are not trained for particular mask types, our method supports sketch-based inpainting, a variety of mask types, and multiple, diverse results. We compared our method with state-of-the-art models both quantitatively and qualitatively and observed that it is competitive with the other models on all mask types; it performs particularly well on images where larger masks are used. Our code, dataset and models are available at: https://github.com/yahyadogan72/iterative facial image inpainting.
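The iterative procedure described in the abstract (encode, generate, re-insert the known pixels, check a discriminator score, then post-process) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `threshold` and `max_iters` parameters, the compositing rule, and the toy stand-in models below are all assumptions made for illustration, and a real system would use the trained CRG encoder and generator, the trained discriminator, and the Unet in their place.

```python
from typing import Callable, List, Optional, Tuple

Image = List[float]  # toy stand-in: a flattened grayscale image
Mask = List[int]     # 1 = masked pixel (to be inpainted), 0 = known pixel


def composite(known: Image, generated: Image, mask: Mask) -> Image:
    """Keep known pixels and take generated pixels only inside the mask."""
    return [g if m else k for k, g, m in zip(known, generated, mask)]


def iterative_inpaint(
    image: Image,
    mask: Mask,
    encoder: Callable[[Image], List[float]],
    generator: Callable[[List[float]], Image],
    discriminator: Callable[[Image], float],
    post_process: Optional[Callable[[Image, Mask], Image]] = None,
    threshold: float = 0.99,   # assumed convergence score, not from the paper
    max_iters: int = 10,       # assumed iteration cap, not from the paper
) -> Tuple[Image, int]:
    """Refine the masked region until the discriminator score passes threshold."""
    if max_iters < 1:
        raise ValueError("max_iters must be at least 1")
    current = list(image)
    steps = 0
    for steps in range(1, max_iters + 1):
        latent = encoder(current)        # embed the current image
        generated = generator(latent)    # decode a full image from the latent
        current = composite(image, generated, mask)
        if discriminator(current) >= threshold:
            break                        # judged plausible: stop iterating
    if post_process is not None:
        # Hypothetical hook where a boundary-fixing model (e.g. a Unet) would run.
        current = post_process(current, mask)
    return current, steps


# Toy stand-ins (hypothetical): the "latent" is the mean intensity, the
# generator paints a flat image, and the score rewards a small intensity range.
def toy_encoder(img: Image) -> List[float]:
    return [sum(img) / len(img)]


def make_toy_generator(size: int) -> Callable[[List[float]], Image]:
    return lambda latent: [latent[0]] * size


def toy_discriminator(img: Image) -> float:
    return 1.0 - (max(img) - min(img))


image = [0.5, 0.5, 0.0, 0.0, 0.5, 0.5]  # masked pixels zeroed out
mask = [0, 0, 1, 1, 0, 0]
result, steps = iterative_inpaint(
    image, mask, toy_encoder, make_toy_generator(len(image)), toy_discriminator
)
```

In this toy setting the masked values move toward the surrounding intensity at each pass while the known pixels are never modified, which mirrors the incremental filling and discriminator-based stopping rule only in spirit.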

updated: Sun Feb 13 2022 11:11:59 GMT+0000 (UTC)

published: Mon Jan 18 2021 12:19:58 GMT+0000 (UTC)

arXiv

