A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions

René Schuster; Christian Unger; Didier Stricker

Motion estimation is one of the core challenges in computer vision. With traditional dual-frame approaches, occlusions and out-of-view motions are a limiting factor, especially in environmental perception for vehicles, where objects undergo large (ego-)motion. Our work proposes a novel data-driven approach for temporal fusion of scene flow estimates in a multi-frame setup to overcome the issue of occlusion. Contrary to most previous methods, we do not rely on a constant motion model, but instead learn a generic temporal relation of motion from data. In a second step, a neural network combines bi-directional scene flow estimates from a common reference frame, yielding a refined estimate and, as a natural byproduct, occlusion masks. This way, our approach provides a fast multi-frame extension for a variety of scene flow estimators that outperforms the underlying dual-frame approaches.
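
To make the two-step idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-pixel 3D motion vectors stored as plain tuples, and it replaces both learned components with hand-written stand-ins: `invert_constant_motion` is the constant-motion baseline that the paper explicitly moves away from, and the occlusion weights passed to `fuse` are supplied by hand, whereas the abstract describes a neural network that produces them. All names here are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: one 3D motion vector per pixel.
Flow = Tuple[float, float, float]


def invert_constant_motion(backward: Flow) -> Flow:
    """Stand-in for the learned motion model.

    Under a constant-motion assumption, the forward motion is simply the
    negated backward motion. The paper learns this relation from data instead.
    """
    bx, by, bz = backward
    return (-bx, -by, -bz)


def fuse(forward: List[Flow], predicted: List[Flow],
         occlusion: List[float]) -> List[Flow]:
    """Blend two forward estimates per pixel.

    occlusion[i] in [0, 1] is the weight given to the estimate predicted from
    the backward direction; 1.0 means the pixel is treated as occluded in the
    forward frame pair. In this sketch the weights are an input, not learned.
    """
    fused = []
    for f, p, w in zip(forward, predicted, occlusion):
        fused.append(tuple((1.0 - w) * fc + w * pc for fc, pc in zip(f, p)))
    return fused


# Two pixels: the first is visible, the second is occluded in the next frame,
# so its direct forward estimate (5, 5, 5) is assumed to be unreliable.
forward_est = [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
backward_est = [(-1.0, 0.0, 0.0), (-2.0, 0.0, 0.0)]
occlusion_weights = [0.0, 1.0]

predicted_est = [invert_constant_motion(b) for b in backward_est]
refined = fuse(forward_est, predicted_est, occlusion_weights)
```

In this toy case the occluded pixel takes its motion entirely from the backward direction, which is the intuition behind obtaining occlusion masks as a byproduct of the fusion weights.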

updated: Tue Nov 03 2020 10:14:11 GMT+0000 (UTC)

published: Tue Nov 03 2020 10:14:11 GMT+0000 (UTC)

arXiv
