Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction

Wonkwang Lee; Whie Jung; Han Zhang; Ting Chen; Jing Yu Koh; Thomas Huang; Hyungsuk Yoon; Honglak Lee; Seunghoon Hong

Learning to predict the long-term future of video frames is notoriously challenging due to inherent ambiguities in the distant future and the dramatic amplification of prediction errors over time. Despite recent advances in the literature, existing approaches are limited to moderately short-term prediction (less than a few seconds), and extrapolating them to a longer future quickly destroys structure and content. In this work, we revisit hierarchical models for video prediction. Our method predicts future frames by first estimating a sequence of semantic structures and then translating the structures into pixels via video-to-video translation. Despite its simplicity, we show that modeling structures and their dynamics in a discrete semantic structure space with a stochastic recurrent estimator leads to surprisingly successful long-term prediction. We evaluate our method on three challenging datasets involving car driving and human dancing, and demonstrate that it can generate complicated scene structures and motions over a very long time horizon (i.e., thousands of frames), setting a new standard for video prediction with prediction horizons orders of magnitude longer than those of existing approaches. Full videos and code are available at https://1konny.github.io/HVP/.
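The two-stage idea in the abstract (predict discrete semantic structures first, then translate them to pixels) can be illustrated with a toy sketch. This is not the authors' model: the random shift below merely stands in for a learned stochastic recurrent estimator, the palette lookup stands in for a learned video-to-video translation network, and every name here (`NUM_CLASSES`, `PALETTE`, `predict_structures`, `translate_to_pixels`, `predict_video`) is hypothetical.

```python
import random

# Hypothetical label set and label-to-RGB palette (assumptions for illustration).
NUM_CLASSES = 3
PALETTE = {0: (0, 0, 0), 1: (128, 128, 128), 2: (255, 255, 255)}


def predict_structures(context, horizon, rng):
    """Stage 1 (toy stand-in): roll out discrete label maps autoregressively.

    A random per-step shift plays the role of the sampled latent in a
    stochastic recurrent estimator. Label maps are lists of rows of ints.
    """
    structures = []
    current = context[-1]
    for _ in range(horizon):
        shift = rng.choice([0, 1])  # stochastic choice, sampled at every step
        if shift:
            # Rotate each row right by one cell; labels stay discrete.
            current = [row[-1:] + row[:-1] for row in current]
        else:
            current = [row[:] for row in current]
        structures.append(current)
    return structures


def translate_to_pixels(structures):
    """Stage 2 (toy stand-in): map every label to a colour, frame by frame."""
    return [
        [[PALETTE[label] for label in row] for row in frame]
        for frame in structures
    ]


def predict_video(context, horizon, seed=0):
    """Run both stages: structures first, pixels second."""
    rng = random.Random(seed)
    return translate_to_pixels(predict_structures(context, horizon, rng))


if __name__ == "__main__":
    context = [[[0, 1, 2, 1], [2, 2, 0, 1]]]  # one 2x4 label map as context
    frames = predict_video(context, horizon=1000)
    print(len(frames), "frames of size", len(frames[0]), "x", len(frames[0][0]))
```

Because the rollout happens in label space, every predicted value remains a valid class index however long the horizon is. This loosely mirrors the abstract's argument for modeling dynamics in a discrete semantic structure space, although the sketch makes no claim about prediction quality.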

updated: Wed Apr 14 2021 08:39:38 GMT+0000 (UTC)

published: Wed Apr 14 2021 08:39:38 GMT+0000 (UTC)

arXiv
