Temporally Consistent Semantic Video Editing

Yiran Xu; Badour AlBahar; Jia-Bin Huang

Generative adversarial networks (GANs) have demonstrated impressive image generation quality and semantic editing capabilities for real images, e.g., changing object classes, modifying attributes, or transferring styles. However, applying these GAN-based edits to a video independently for each frame inevitably results in temporal flickering artifacts. We present a simple yet effective method to facilitate temporally coherent video editing. Our core idea is to minimize the temporal photometric inconsistency by optimizing both the latent code and the pre-trained generator. We evaluate the quality of our editing on different domains and GAN inversion techniques and show favorable results against the baselines.
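
To make the notion of temporal photometric inconsistency concrete, the following is a minimal sketch of one way such a quantity could be measured. It is not the authors' formulation: the function name `photometric_inconsistency` is hypothetical, frames are represented as flat lists of pixel intensities, and the sketch assumes the frames are already aligned, so no motion compensation is applied. A method that optimizes a latent code and generator would use a differentiable version of a loss in this spirit.

```python
from typing import Sequence


def photometric_inconsistency(frames: Sequence[Sequence[float]]) -> float:
    """Mean absolute intensity change between consecutive frames.

    Assumption (not from the paper): each frame is a flat sequence of
    pixel intensities, and corresponding pixels are already aligned.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")

    total = 0.0
    count = 0
    # Compare each frame with its successor, pixel by pixel.
    for prev, curr in zip(frames, frames[1:]):
        if len(prev) != len(curr):
            raise ValueError("all frames must have the same number of pixels")
        for a, b in zip(prev, curr):
            total += abs(a - b)
            count += 1

    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    return total / count


# A steady clip: every frame is identical, so there is no flicker.
steady = [[0.5, 0.5, 0.5, 0.5] for _ in range(4)]

# A flickering clip: brightness alternates from frame to frame.
flicker = [[0.5 + 0.1 * (t % 2)] * 4 for t in range(4)]

steady_score = photometric_inconsistency(steady)
flicker_score = photometric_inconsistency(flicker)
```

Under these assumptions the steady clip scores zero and the flickering clip scores higher, which is the behavior a flicker-reducing objective would try to drive toward zero on edited video.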

updated: Tue Jun 21 2022 17:59:59 GMT+0000 (UTC)

published: Tue Jun 21 2022 17:59:59 GMT+0000 (UTC)

arXiv
