Latent Image Animator: Learning to Animate Images via Latent Space Navigation

Yaohui Wang; Di Yang; Francois Bremond; Antitza Dantcheva

Deep generative models have made image animation increasingly efficient and its results increasingly realistic. Current animation approaches commonly exploit structure representations extracted from driving videos. Such representations are instrumental in transferring motion from driving videos to still images. However, these approaches fail when the source image and the driving video differ greatly in appearance. Moreover, extracting structure information requires additional modules that increase the complexity of the animation model. Departing from such models, we introduce the Latent Image Animator (LIA), a self-supervised autoencoder that removes the need for structure representations. LIA animates images by linear navigation in the latent space: motion in the generated video is constructed by linear displacement of codes in the latent space. To this end, we simultaneously learn a set of orthogonal motion directions and use their linear combination to represent any displacement in the latent space. Extensive quantitative and qualitative analysis suggests that our model systematically and significantly outperforms state-of-the-art methods on the VoxCeleb, Taichi and TED-talk datasets with respect to generated quality.
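The linear navigation described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the latent dimension (16), the number of directions (4), and the magnitudes are all hypothetical, and the directions here are random vectors orthonormalized with Gram-Schmidt rather than learned. The encoder and decoder are omitted entirely; only the displacement step is shown.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_directions(num_directions, dim, seed=0):
    """Build orthonormal vectors via Gram-Schmidt on random draws.

    Assumption: this stands in for the learned orthogonal motion
    directions; in the paper they are learned, not sampled.
    """
    if num_directions > dim:
        raise ValueError("cannot have more orthogonal directions than dimensions")
    rng = random.Random(seed)
    basis = []
    while len(basis) < num_directions:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along directions already in the basis.
        for b in basis:
            proj = dot(v, b)
            v = [x - proj * y for x, y in zip(v, b)]
        norm = dot(v, v) ** 0.5
        if norm > 1e-8:  # skip (very unlikely) degenerate draws
            basis.append([x / norm for x in v])
    return basis


def navigate(source_code, directions, magnitudes):
    """Return source_code + sum_i magnitudes[i] * directions[i]."""
    target = list(source_code)
    for a, d in zip(magnitudes, directions):
        target = [t + a * x for t, x in zip(target, d)]
    return target


# Hypothetical sizes and values, chosen only for illustration.
directions = orthonormal_directions(num_directions=4, dim=16)
source_code = [0.0] * 16
magnitudes = [0.5, -1.0, 0.0, 2.0]
target_code = navigate(source_code, directions, magnitudes)

# Because the directions are orthonormal, projecting the displacement
# back onto each direction recovers its magnitude.
displacement = [t - s for t, s in zip(target_code, source_code)]
recovered = [dot(displacement, d) for d in directions]
```

One consequence of orthogonality visible in the sketch is that each magnitude can be read back independently by projection, so the coefficients of the linear combination do not interfere with one another.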

updated: Thu Mar 17 2022 02:45:34 GMT+0000 (UTC)

published: Thu Mar 17 2022 02:45:34 GMT+0000 (UTC)

arXiv
