VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing

Paul Couairon; Clément Rambour; Jean-Emmanuel Haugeard; Nicolas Thome

Diffusion-based generative models have recently achieved remarkable success in image generation and editing. However, their use for video editing still faces important limitations. This paper introduces VidEdit, a novel method for zero-shot text-based video editing that ensures strong temporal and spatial consistency. First, we propose to combine atlas-based and pre-trained text-to-image diffusion models to provide a training-free and efficient editing method which, by design, fulfills temporal smoothness. Second, we leverage off-the-shelf panoptic segmenters along with edge detectors and adapt their use for conditioned diffusion-based atlas editing. This ensures fine spatial control over targeted regions while strictly preserving the structure of the original video. Quantitative and qualitative experiments show that VidEdit outperforms state-of-the-art methods on the DAVIS dataset with regard to semantic faithfulness, image preservation, and temporal consistency metrics. With this framework, processing a single video takes only approximately one minute, and multiple compatible edits can be generated from a single text prompt. Project web page: https://videdit.github.io
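
The idea that editing one shared atlas yields temporally consistent frames can be illustrated with a toy NumPy sketch. This is not the authors' implementation: it assumes a precomputed integer lookup table from frame pixels to atlas pixels, and it uses a simple colour inversion as a stand-in for the conditioned diffusion edit. The names `edit_atlas`, `render_frames`, and `make_sliding_uv` are hypothetical and exist only for this illustration.

```python
import numpy as np

def edit_atlas(atlas, mask, edit_fn):
    """Edit the atlas once, keeping original pixels outside the region mask.

    atlas: (Ha, Wa, 3) float array; mask: (Ha, Wa) bool array.
    edit_fn is a placeholder for any image editor (assumption: in practice
    this would be a conditioned text-to-image diffusion model).
    """
    edited = edit_fn(atlas)
    return np.where(mask[..., None], edited, atlas)

def render_frames(atlas, uv):
    """Rebuild all frames by looking up atlas pixels.

    uv: (T, H, W, 2) integer array of (row, col) atlas indices per frame pixel.
    Returns a (T, H, W, 3) array.
    """
    return atlas[uv[..., 0], uv[..., 1]]

def make_sliding_uv(num_frames, height, width):
    """Toy mapping: frame t views the atlas window shifted t columns right."""
    rows, cols = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    return np.stack(
        [np.stack([rows, cols + t], axis=-1) for t in range(num_frames)]
    )

# Toy data: an 8x8 atlas, a rectangular target region, three 4x4 frames.
rng = np.random.default_rng(0)
atlas = rng.random((8, 8, 3))
mask = np.zeros((8, 8), dtype=bool)
mask[1:3, 2:5] = True

edited_atlas = edit_atlas(atlas, mask, lambda a: 1.0 - a)  # stand-in edit
frames = render_frames(edited_atlas, make_sliding_uv(3, 4, 4))
print(frames.shape)
```

Because every frame reads from the same edited atlas, a given atlas point has the same colour in every frame where it appears, and pixels outside the mask are left exactly as they were.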

updated: Wed Jun 14 2023 19:15:49 GMT+0000 (UTC)

published: Wed Jun 14 2023 19:15:49 GMT+0000 (UTC)

arXiv
