Optical Flow Estimation in 360° Videos: Dataset, Model and Application

Bin Duan; Keshav Bhandari; Gaowen Liu; Yan Yan

Optical flow estimation has been a long-standing and fundamental problem in the computer vision community. However, despite the advances of optical flow estimation in perspective videos, its 360° video counterpart remains in its infancy, primarily due to the shortage of benchmark datasets and the failure to accommodate the omnidirectional nature of 360° videos. We propose the first perceptually realistic 360° field-of-view video benchmark dataset, namely FLOW360, with 40 different videos and 4,000 video frames. We then conduct a comprehensive characteristic analysis and extensive comparisons with existing datasets, demonstrating FLOW360's perceptual realism, uniqueness, and diversity. Moreover, we present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF) estimation, which is trained in a contrastive manner via a hybrid loss that combines siamese contrastive and optical flow losses. By training the model on random rotations of the input omnidirectional frames, our proposed contrastive scheme accommodates the omnidirectional nature of optical flow estimation in 360° videos, resulting in significantly reduced prediction errors. The learning scheme is further shown to be effective by extending our siamese learning scheme and omnidirectional optical flow estimation to the egocentric activity recognition task, where classification accuracy improves by up to ∼26%. To summarize, we study the problem of optical flow estimation in 360° videos from the perspectives of benchmark dataset, learning model, and practical application. The FLOW360 dataset and code are available at https://siamlof.github.io.
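The abstract names two ingredients, random rotations of the omnidirectional input and a hybrid loss, without giving their exact form. The following is a minimal sketch of how such a setup might look, not the authors' implementation. It assumes equirectangular frames, restricts the rotation to yaw (which is a circular shift of image columns; pitch and roll would need resampling on the sphere), uses mean end-point error as the flow term, and uses negative cosine similarity as a stand-in for the siamese term. The function names and the `weight` parameter are hypothetical.

```python
import numpy as np


def yaw_rotate(equirect, shift_px):
    """Rotate an equirectangular array (H, W, ...) about the vertical axis.

    For this projection a yaw rotation is a circular shift of the columns.
    The same shift applied to a flow map keeps frame and flow aligned.
    """
    return np.roll(equirect, shift_px, axis=1)


def endpoint_error(pred_flow, gt_flow):
    """Mean Euclidean distance between flow vectors, arrays of shape (H, W, 2)."""
    return float(np.mean(np.linalg.norm(pred_flow - gt_flow, axis=-1)))


def negative_cosine(feat_a, feat_b, eps=1e-8):
    """Negative cosine similarity of two feature arrays (assumed siamese term)."""
    a, b = feat_a.ravel(), feat_b.ravel()
    return float(-np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def hybrid_loss(pred_a, gt_a, pred_b, gt_b, feat_a, feat_b, weight=0.5):
    """Flow loss averaged over both rotated views plus a weighted siamese term.

    `weight` is a hypothetical balancing factor, not a value from the paper.
    """
    flow_term = 0.5 * (endpoint_error(pred_a, gt_a) + endpoint_error(pred_b, gt_b))
    return flow_term + weight * negative_cosine(feat_a, feat_b)
```

In a training loop following this sketch, each frame pair and its ground-truth flow would be shifted by two independently sampled yaw offsets, both views passed through a shared-weight network, and `hybrid_loss` evaluated on the two predictions and their features.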

updated: Fri Jan 27 2023 17:50:09 GMT+0000 (UTC)

published: Fri Jan 27 2023 17:50:09 GMT+0000 (UTC)

arXiv
