Continuous Facial Motion Deblurring

Tae Bok Lee; Sujy Han; Yong Seok Heo

We introduce a novel framework for continuous facial motion deblurring that restores the continuous sharp moments latent in a single motion-blurred face image via a moment control factor. Although a motion-blurred image is the accumulated signal of continuous sharp moments during the exposure time, most existing single-image deblurring approaches aim to restore a fixed number of frames using multiple networks and training stages. To address this problem, we propose a continuous facial motion deblurring network based on a GAN (CFMD-GAN), which restores the continuous moments latent in a single motion-blurred face image with a single network and a single training stage. To stabilize network training, we train the generator to restore continuous moments in the order determined by our facial motion-based reordering process (FMR), which utilizes domain-specific knowledge of the face. Moreover, we propose an auxiliary regressor that helps our generator produce more accurate images by estimating continuous sharp moments. Furthermore, we introduce a control-adaptive (ContAda) block that performs spatially deformable convolution and channel-wise attention as a function of the control factor. Extensive experiments on the 300VW dataset demonstrate that the proposed framework generates varying numbers of continuous output frames by varying the moment control factor. Compared with recent single-to-single image deblurring networks trained on the same 300VW training set, the proposed method shows superior performance in restoring the central sharp frame in terms of perceptual metrics, including LPIPS, FID, and ArcFace identity distance. The proposed method also outperforms the existing single-to-video deblurring method in both qualitative and quantitative comparisons.
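The abstract's premise, that a blurred image accumulates sharp moments over the exposure and that a control factor selects one of them, can be illustrated with a small NumPy sketch. This is not the authors' pipeline: the frame-averaging blur model, the linear interpolation between frames, the control-factor range of [0, 1], and the names `synthesize_blur` and `moment_at` are all assumptions made for illustration.

```python
import numpy as np


def synthesize_blur(frames: np.ndarray) -> np.ndarray:
    """Approximate a motion-blurred image by averaging sharp frames.

    frames has shape (T, H, W). Averaging is an assumed blur model,
    not necessarily the one used in the paper.
    """
    return frames.mean(axis=0)


def moment_at(frames: np.ndarray, t: float) -> np.ndarray:
    """Return the sharp moment at control factor t in [0, 1].

    t = 0 maps to the first frame and t = 1 to the last (assumed
    convention); values in between are linearly interpolated.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    pos = t * (len(frames) - 1)          # fractional frame index
    lo = int(np.floor(pos))              # frame just before pos
    hi = min(lo + 1, len(frames) - 1)    # frame just after pos
    w = pos - lo                         # interpolation weight
    return (1.0 - w) * frames[lo] + w * frames[hi]


# Toy sequence: a bright dot moving left to right across a 1x5 image.
frames = np.eye(5, dtype=np.float64).reshape(5, 1, 5)

blurred = synthesize_blur(frames)   # the dot is smeared over all pixels
central = moment_at(frames, 0.5)    # the sharp moment at mid-exposure
```

Under these assumptions, a training example for a continuous deblurring model would pair `blurred` and a sampled `t` as input with `moment_at(frames, t)` as the target, so a single network can be queried for any moment instead of a fixed set of frames.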

updated: Thu Jul 14 2022 02:53:37 GMT+0000 (UTC)

published: Thu Jul 14 2022 02:53:37 GMT+0000 (UTC)

arXiv
