Dimensions of Motion: Monocular Prediction through Flow Subspaces

Richard Strong Bowen; Richard Tucker; Ramin Zabih; Noah Snavely

We introduce a way to learn to estimate a scene representation from a single image by predicting, for each training example, a low-dimensional subspace of optical flow that encompasses the variety of possible camera and object movements. Supervision is provided by a novel loss that measures the distance between this predicted flow subspace and an observed optical flow. This provides a new approach to learning scene representation tasks, such as monocular depth prediction or instance segmentation, in an unsupervised fashion from in-the-wild input videos, without requiring camera poses, intrinsics, or an explicit multi-view stereo step. We evaluate our method in multiple settings, including an indoor depth prediction task, where it achieves performance comparable to recent methods trained with more supervision.
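
To make the idea of a distance between a flow subspace and an observed flow concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the loss from the paper: it assumes the predicted subspace is given as k basis flow fields and takes the distance to be the norm of the least-squares projection residual. The function name `subspace_distance` and the array layout are hypothetical.

```python
import numpy as np


def subspace_distance(basis, flow):
    """Distance from an observed flow field to the span of basis flow fields.

    basis: array of shape (k, H, W, 2), k basis flows (assumed layout).
    flow:  array of shape (H, W, 2), the observed optical flow.
    Returns the L2 norm of the part of `flow` that the basis cannot explain.
    """
    k = basis.shape[0]
    # Flatten each basis flow into one column of a (H*W*2, k) matrix.
    A = basis.reshape(k, -1).T
    b = flow.reshape(-1)
    # Best coefficients for reproducing the observed flow from the basis.
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    # What remains after projecting onto the subspace.
    residual = b - A @ coeffs
    return float(np.linalg.norm(residual))
```

Under these assumptions, an observed flow that is an exact linear combination of the basis flows has distance zero (up to floating-point error), and any component outside the subspace increases the distance. A training loss would need a differentiable version of the same projection in the framework used for the network.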

updated: Thu Aug 11 2022 19:11:40 GMT+0000 (UTC)

published: Thu Dec 02 2021 18:52:54 GMT+0000 (UTC)

arXiv
