Probabilistic Tracking with Deep Factors

Fan Jiang; Andrew Marmon; Ildebrando De Courten; Marc Rasi; Frank Dellaert

In many applications of computer vision it is important to accurately estimate the trajectory of an object over time by fusing data from a number of sources, of which 2D and 3D imagery is only one. In this paper, we show how to use a deep feature encoding in conjunction with generative densities over the features in a factor-graph based, probabilistic tracking framework. We present a likelihood model that combines a learned feature encoder with generative densities over the encoded features, both trained in a supervised manner. We also experiment with directly inferring probability through the use of image classification models that feed into the likelihood formulation. These models are used to implement deep factors that are added to the factor graph to complement other factors that represent domain-specific knowledge, such as motion models and other prior information. The factors are then optimized together in a non-linear least-squares tracking framework that takes the form of an Extended Kalman Smoother with a Gaussian prior. A key feature of our likelihood model is that it leverages the Lie group properties of the tracked target's pose to apply the feature encoding on an image patch, extracted through a differentiable warp function inspired by spatial transformer networks. To illustrate the proposed approach, we evaluate it on a challenging social insect behavior dataset and show that using deep features outperforms the linear appearance models previously used in this setting.
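The abstract gives no equations, so the following is only a toy sketch of the general idea of optimizing a prior factor, motion factors, and measurement factors together as one least-squares problem. Everything in it is an assumption made for illustration: the state is a scalar rather than a Lie-group pose, the motion model is a random walk, and a plain Gaussian measurement factor stands in for the learned deep factor. The function name `smooth_trajectory` and its parameters are hypothetical and do not come from the paper.

```python
def smooth_trajectory(measurements, prior_mean, sigma_prior=1.0,
                      sigma_motion=0.5, sigma_meas=1.0):
    """MAP estimate of a scalar random-walk trajectory (toy smoother).

    Minimizes the sum of squared, noise-weighted factor residuals:
        prior:        (x[0] - prior_mean) / sigma_prior
        motion:       (x[k+1] - x[k]) / sigma_motion
        measurement:  (x[k] - z[k]) / sigma_meas   (stand-in for a deep factor)
    All factors are linear here, so a single linear solve suffices. With
    non-linear factors the same solve would be repeated on linearized
    residuals (Gauss-Newton).
    """
    n = len(measurements)
    w_p = 1.0 / sigma_prior ** 2
    w_m = 1.0 / sigma_motion ** 2
    w_z = 1.0 / sigma_meas ** 2

    # Assemble the normal equations A x = b. Each factor touches at most two
    # neighbouring states, so A is symmetric tridiagonal.
    diag = [w_z] * n
    off = [-w_m] * (n - 1)
    rhs = [w_z * z for z in measurements]
    diag[0] += w_p
    rhs[0] += w_p * prior_mean
    for k in range(n - 1):
        diag[k] += w_m
        diag[k + 1] += w_m

    # Forward elimination (Thomas algorithm for tridiagonal systems).
    for k in range(1, n):
        m = off[k - 1] / diag[k - 1]
        diag[k] -= m * off[k - 1]
        rhs[k] -= m * rhs[k - 1]

    # Back substitution yields the smoothed estimate of every state.
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (rhs[k] - off[k] * x[k + 1]) / diag[k]
    return x


if __name__ == "__main__":
    # A single outlying measurement is pulled toward its neighbours.
    print(smooth_trajectory([0.0, 10.0, 0.0], prior_mean=0.0))
```

Because every factor couples at most two consecutive states, the system stays sparse, which is the property that factor-graph solvers exploit on larger problems.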

updated: Thu Dec 02 2021 21:31:51 GMT+0000 (UTC)

published: Thu Dec 02 2021 21:31:51 GMT+0000 (UTC)

arXiv
