Modelling Latent Dynamics of StyleGAN using Neural ODEs

Weihao Xia; Yujiu Yang; Jing-Hao Xue

In this paper, we propose to model video dynamics by learning the trajectory of latent codes obtained by independently inverting each frame with a GAN. By considering each latent code as a moving particle and the latent space as a high-dimensional dynamical system, the entire sequence is seen as discrete-time observations of a continuous trajectory of the initial latent code. The latent codes representing different frames are therefore reformulated as state transitions of the initial frame, which can be modeled by neural ordinary differential equations. The learned continuous trajectory allows us to perform infinite frame interpolation and consistent video manipulation. The latter task is reintroduced for video editing, with the advantage that the core operations need to be applied to the first frame only while temporal consistency is maintained across all frames. Extensive experiments demonstrate that our method achieves state-of-the-art performance with much less computation. Code is available at https://github.com/weihaox/dynode_released.
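The idea of treating frame latents as samples of one continuous trajectory can be sketched in a few lines. The example below is only an illustration under stated assumptions and is not the authors' implementation: the dynamics function is a tiny MLP with random (untrained) weights standing in for a learned network, the solver is a fixed-step Runge-Kutta (RK4) integrator chosen for simplicity, the latent dimension is a toy value, and all names (`make_dynamics`, `rk4_step`, `integrate`) are hypothetical. It shows how a single initial latent code `z0` yields latents at the original frame times, at in-between times (interpolation), and after an edit applied to `z0` only.

```python
import math
import random


def make_dynamics(dim, hidden, seed=0):
    """Return f(z, t) -> dz/dt. Random weights stand in for a trained network."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 0.3) for _ in range(dim + 1)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 0.3) for _ in range(hidden)] for _ in range(dim)]

    def f(z, t):
        x = z + [t]  # condition the dynamics on time by appending t
        h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
        return [sum(w * v for w, v in zip(row, h)) for row in w2]

    return f


def rk4_step(f, z, t, dt):
    """One classical fourth-order Runge-Kutta step from t to t + dt."""
    k1 = f(z, t)
    k2 = f([a + 0.5 * dt * b for a, b in zip(z, k1)], t + 0.5 * dt)
    k3 = f([a + 0.5 * dt * b for a, b in zip(z, k2)], t + 0.5 * dt)
    k4 = f([a + dt * b for a, b in zip(z, k3)], t + dt)
    return [a + dt / 6 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(z, k1, k2, k3, k4)]


def integrate(f, z0, times, substeps=20):
    """Return z(t) for each t in the increasing list `times`, with z(times[0]) = z0."""
    out = [list(z0)]
    z, t = list(z0), times[0]
    for t_next in times[1:]:
        dt = (t_next - t) / substeps
        for _ in range(substeps):
            z = rk4_step(f, z, t, dt)
            t += dt
        t = t_next  # avoid accumulated floating-point drift in t
        out.append(z)
    return out


dim = 8  # toy size; real StyleGAN latent codes are much larger
f = make_dynamics(dim, hidden=16)
z0 = [0.1 * i for i in range(dim)]  # stands in for the inverted first frame

# Latents at the observed frame times.
frame_times = [0.0, 1.0, 2.0, 3.0]
frames = integrate(f, z0, frame_times)

# Interpolation: query the same trajectory at in-between times.
dense_times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
dense = integrate(f, z0, dense_times)

# Editing: shift only the initial latent, then re-integrate the same dynamics.
edit = [0.5] + [0.0] * (dim - 1)  # hypothetical edit direction
z0_edited = [a + b for a, b in zip(z0, edit)]
edited = integrate(f, z0_edited, frame_times)
```

In a real pipeline each latent in `frames`, `dense`, or `edited` would be passed to the GAN generator to render a frame, and the dynamics network would first be trained so that the integrated trajectory matches the latents obtained by inverting the observed frames; neither step is shown here.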

updated: Sat Apr 22 2023 20:18:14 GMT+0000 (UTC)

published: Tue Aug 23 2022 21:20:38 GMT+0000 (UTC)

arXiv
