Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis

Chonghyuk Song; Gengshan Yang; Kangle Deng; Jun-Yan Zhu; Deva Ramanan

We explore the task of embodied view synthesis from monocular videos of deformable scenes. Given a minute-long RGBD video of people interacting with their pets, we render the scene from novel camera trajectories derived from the in-scene motion of actors: (1) egocentric cameras that simulate the point of view of a target actor and (2) third-person cameras that follow the actor. Building such a system requires reconstructing the root-body and articulated motion of each actor in the scene, as well as a scene representation that supports free-viewpoint synthesis. Longer videos are more likely to capture the scene from diverse viewpoints (which helps reconstruction) but are also more likely to contain larger motions (which complicates reconstruction). To address these challenges, we present Total-Recon, the first method to photorealistically reconstruct deformable scenes from long monocular RGBD videos. Crucially, to scale to long videos, our method hierarchically decomposes the scene motion into the motion of each object, which itself is decomposed into global root-body motion and local articulations. To quantify such "in-the-wild" reconstruction and view synthesis, we collect ground-truth data from a specialized stereo RGBD capture rig for 11 challenging videos, on which our method significantly outperforms prior art. Code, videos, and data can be found at https://andrewsonga.github.io/totalrecon.
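
The hierarchical decomposition and the actor-derived cameras described above can be illustrated with a minimal sketch. It is not the Total-Recon implementation: it assumes each object's local articulation can be stood in for by a single rigid 4x4 transform (the abstract does not specify the articulation model), and every function name here (`object_to_world`, `egocentric_camera`, `third_person_camera`) is hypothetical.

```python
import math

# Illustrative sketch only. Assumption: articulation is modeled as one rigid
# 4x4 transform per body part; all names below are hypothetical.

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    """Homogeneous 4x4 translation."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_y(theta):
    """Homogeneous 4x4 rotation about the y axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(transform, point):
    """Apply a 4x4 transform to a 3D point."""
    v = [point[0], point[1], point[2], 1.0]
    return tuple(sum(transform[i][k] * v[k] for k in range(4)) for i in range(3))

def object_to_world(root_pose, articulation, canonical_point):
    """Per-object motion: local articulation first, then global root-body motion."""
    return apply(matmul(root_pose, articulation), canonical_point)

def egocentric_camera(root_pose, part_articulation):
    """Camera-to-world pose rigidly attached to one body part (e.g. the head)."""
    return matmul(root_pose, part_articulation)

def third_person_camera(root_pose, offset):
    """Camera-to-world pose held at a fixed offset in the actor's root frame."""
    return matmul(root_pose, translation(*offset))

# One frame of one actor: root body at x = 2, turned 90 degrees about y.
root_pose = matmul(translation(2.0, 0.0, 0.0), rotation_y(math.pi / 2))
articulation = translation(0.0, 0.5, 0.0)   # stand-in for a local articulation
world_point = object_to_world(root_pose, articulation, (0.0, 0.0, 1.0))
follow_cam = third_person_camera(root_pose, (0.0, 1.0, -3.0))
follow_cam_position = tuple(follow_cam[i][3] for i in range(3))
```

Because both cameras are built from the actor's own root-body pose, they move with the actor from frame to frame, which is why the root-body and articulated motion of each actor must be reconstructed before such views can be rendered.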

updated: Mon Apr 24 2023 17:59:52 GMT+0000 (UTC)

published: Mon Apr 24 2023 17:59:52 GMT+0000 (UTC)

arXiv
