VCGAN: Video Colorization with Hybrid Generative Adversarial Network

Yuzhi Zhao; Lai-Man Po; Wing-Yin Yu; Yasar Abbas Ur Rehman; Mengyang Liu; Yujia Zhang; Weifeng Ou

We propose Video Colorization with Hybrid Generative Adversarial Network (VCGAN), a hybrid recurrent model that improves video colorization through end-to-end learning. VCGAN addresses two prevalent issues in the video colorization domain: temporal consistency, and the unification of the colorization network and the refinement network into a single architecture. To enhance colorization quality and spatiotemporal consistency, the main stream of the VCGAN generator is assisted by two additional networks: a global feature extractor and a placeholder feature extractor. The global feature extractor encodes the global semantics of the grayscale input to enhance colorization quality, whereas the placeholder feature extractor acts as a feedback connection that encodes the semantics of the previously colorized frame to maintain spatiotemporal consistency. When the input to the placeholder feature extractor is replaced with the grayscale input, the hybrid VCGAN also has the potential to perform image colorization. To improve consistency between distant frames, we propose a dense long-term loss that smooths the temporal disparity between every pair of remote frames. Trained jointly with colorization and temporal losses, VCGAN strikes a good balance between color vividness and video continuity. Experimental results demonstrate that VCGAN produces higher-quality and more temporally consistent colorful videos than existing approaches.
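The feedback connection and the dense pairwise idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `colorize_video`, `colorize_frame`, `mean_abs_diff`, and `dense_pairwise_loss` are hypothetical, frames are reduced to flat lists of floats, and the loss is a plain mean absolute difference over every frame pair. A real temporal loss would likely compensate for motion between frames before comparing them, which is omitted here.

```python
from itertools import combinations
from typing import Callable, List, Sequence

Frame = List[float]  # a flattened frame; real frames would be image tensors


def colorize_video(
    gray_frames: Sequence[Frame],
    colorize_frame: Callable[[Frame, Frame], Frame],
) -> List[Frame]:
    """Colorize frames in order, feeding each output back as the next placeholder input."""
    outputs: List[Frame] = []
    for gray in gray_frames:
        # Assumption: with no previous output, fall back to the grayscale frame,
        # mirroring the image colorization mode mentioned in the abstract.
        previous = outputs[-1] if outputs else gray
        outputs.append(colorize_frame(gray, previous))
    return outputs


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two frames of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def dense_pairwise_loss(frames: Sequence[Frame]) -> float:
    """Average the frame difference over every pair (i, j) with i < j."""
    pairs = list(combinations(range(len(frames)), 2))
    if not pairs:
        return 0.0
    return sum(mean_abs_diff(frames[i], frames[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy stand-in for the generator: blend the grayscale frame with the previous output.
    def blend(gray: Frame, prev: Frame) -> Frame:
        return [0.5 * g + 0.5 * p for g, p in zip(gray, prev)]

    video = colorize_video([[1.0], [3.0], [5.0]], blend)
    print(video, dense_pairwise_loss(video))
```

Comparing every pair of frames, not only adjacent ones, is what makes the penalty "dense": drift that accumulates slowly across many frames still contributes to the loss even when each consecutive step looks smooth.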

updated: Mon Apr 26 2021 05:50:53 GMT+0000 (UTC)

published: Mon Apr 26 2021 05:50:53 GMT+0000 (UTC)

arXiv
