To fit or not to fit: Model-based Face Reconstruction and Occlusion Segmentation from Weak Supervision

Chunlu Li; Andreas Morel-Forster; Thomas Vetter; Bernhard Egger; Adam Kortylewski

3D face reconstruction from a single image is challenging because the problem is ill-posed. Model-based face autoencoders address this issue effectively by fitting a face model to the target image in a weakly supervised manner. However, in unconstrained environments, occlusions distort the face reconstruction because the model often erroneously tries to adapt to occluded face regions. Supervised occlusion segmentation is a viable way to avoid fitting occluded face regions, but it requires a large amount of annotated training data. In this work, we enable model-based face autoencoders to segment occluders accurately without requiring any additional supervision during training, which separates the regions where the model will be fitted from those where it will not. To achieve this, we extend face autoencoders with a segmentation network. The segmentation network decides which regions the model should adapt to by balancing a trade-off between including pixels, so that the model adapts to them, and excluding pixels, so that the model fitting is not negatively affected and reaches higher overall reconstruction accuracy on the pixels that show the face. This leads to a synergistic effect: the occlusion segmentation guides the training of the face autoencoder to constrain the fitting in the non-occluded regions, while the improved fitting enables the segmentation model to better predict the occluded face regions. Qualitative and quantitative experiments on the CelebA-HQ database and the AR database verify the effectiveness of our model in improving 3D face reconstruction under occlusions and in enabling accurate occlusion segmentation from weak supervision only. Code is available at https://github.com/unibas-gravis/Occlusion-Robust-MoFA.
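
The include-or-exclude trade-off described in the abstract can be illustrated with a toy per-pixel objective. The sketch below is not the loss used in the paper or in the linked repository; it is a minimal illustration that assumes a hypothetical fixed cost (`exclusion_cost`) for every pixel left out of the fit, and the function names `masked_fit_loss` and `best_mask` are invented for this example. Under that assumption, a pixel is worth fitting only when its reconstruction error is lower than the cost of excluding it.

```python
import numpy as np


def pixel_error(target, rendered):
    """Mean squared colour difference per pixel, shape (H, W)."""
    return np.mean((target - rendered) ** 2, axis=-1)


def masked_fit_loss(target, rendered, mask, exclusion_cost=0.1):
    """Toy objective: fitted pixels pay their error, excluded pixels pay a
    fixed cost. `exclusion_cost` is an assumption made for this sketch."""
    err = pixel_error(target, rendered)
    return float(np.mean(mask * err + (1.0 - mask) * exclusion_cost))


def best_mask(target, rendered, exclusion_cost=0.1):
    """Mask minimising the toy objective: keep a pixel (1.0) only if fitting
    it is cheaper than excluding it."""
    err = pixel_error(target, rendered)
    return (err < exclusion_cost).astype(np.float64)


# Toy 4x4 RGB image: the rendered face matches the target everywhere
# except in a 2x2 patch that stands in for an occluder.
rendered = np.full((4, 4, 3), 0.5)
target = rendered.copy()
target[:2, :2] = 1.0

mask = best_mask(target, rendered)
loss_all = masked_fit_loss(target, rendered, np.ones((4, 4)))
loss_masked = masked_fit_loss(target, rendered, mask)
print(mask)
print(loss_all, loss_masked)
```

In this toy setting the occluded patch is excluded and the total loss drops, which mirrors the qualitative behaviour the abstract describes. In the actual method the mask is predicted by a segmentation network trained jointly with the face autoencoder rather than computed by thresholding.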

updated: Thu Jun 17 2021 15:52:19 GMT+0000 (UTC)

published: Thu Jun 17 2021 15:52:19 GMT+0000 (UTC)

arXiv
