Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision

Karttikeya Mangalam; Ehsan Adeli; Kuan-Hui Lee; Adrien Gaidon; Juan Carlos Niebles

We tackle the problem of Human Locomotion Forecasting, the task of jointly predicting the spatial positions of several keypoints on the human body in the near future in an egocentric setting. In contrast to previous work that aims to solve either pose prediction or trajectory forecasting in isolation, we propose a framework that unifies the two problems and addresses the practically useful task of pedestrian locomotion prediction in the wild. One of the major challenges in solving this task is the scarcity of egocentric video datasets with dense annotations for pose, depth, or egomotion. To surmount this difficulty, we use state-of-the-art models to generate (noisy) annotations and propose robust forecasting models that can learn from this noisy supervision. We present a method to disentangle the overall pedestrian motion into easier-to-learn subparts by utilizing a pose completion module and a decomposition module. The completion module fills in the missing keypoint annotations, and the decomposition module breaks the cleaned locomotion down into a global stream (trajectory) and a local stream (pose keypoint movements). Further, with Quasi-RNN as our backbone, we propose a novel hierarchical trajectory forecasting network that utilizes low-level, vision-domain-specific signals such as egomotion and depth to predict the global trajectory. Our method leads to state-of-the-art results for the prediction of human locomotion in the egocentric view. Project page: https://karttikeya.github.io/publication/plf/
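
The completion and decomposition steps described above can be illustrated with a minimal sketch. It is not the method from the paper: the abstract does not say how either module is implemented, so the sketch assumes linear interpolation over time as a stand-in for completion, and subtraction of a single root keypoint as a stand-in for decomposition. The names `complete`, `decompose`, `recompose`, and the `root` parameter are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Frame = List[Optional[Point]]  # one entry per keypoint; None marks a missing detection


def complete(frames: List[Frame]) -> List[List[Point]]:
    """Fill missing keypoints by linear interpolation over time.

    Assumption: a simple stand-in for a completion module, not the paper's model.
    """
    n_frames, n_kp = len(frames), len(frames[0])
    out = [list(f) for f in frames]
    for k in range(n_kp):
        known = [t for t in range(n_frames) if frames[t][k] is not None]
        if not known:
            raise ValueError(f"keypoint {k} is never observed")
        for t in range(n_frames):
            if frames[t][k] is not None:
                continue
            prev = max((s for s in known if s < t), default=None)
            nxt = min((s for s in known if s > t), default=None)
            if prev is None:        # gap at the start: copy the next observation
                out[t][k] = frames[nxt][k]
            elif nxt is None:       # gap at the end: copy the last observation
                out[t][k] = frames[prev][k]
            else:                   # interior gap: interpolate between neighbours
                w = (t - prev) / (nxt - prev)
                (x0, y0), (x1, y1) = frames[prev][k], frames[nxt][k]
                out[t][k] = (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return out


def decompose(frames: List[List[Point]], root: int = 0):
    """Split completed keypoints into a global and a local stream.

    Assumption: the global stream is the path of one root keypoint and the
    local stream is every keypoint expressed relative to that root.
    """
    global_stream = [f[root] for f in frames]
    local_stream = [
        [(x - f[root][0], y - f[root][1]) for (x, y) in f] for f in frames
    ]
    return global_stream, local_stream


def recompose(global_stream, local_stream):
    """Add the global position back to each local pose."""
    return [
        [(gx + lx, gy + ly) for (lx, ly) in pose]
        for (gx, gy), pose in zip(global_stream, local_stream)
    ]


# Three frames, two keypoints; the second keypoint is missing in the middle frame.
frames = [[(0, 0), (1, 1)], [(1, 0), None], [(2, 0), (3, 1)]]
completed = complete(frames)
trajectory, pose = decompose(completed)
```

In this toy setup the two streams could be forecast separately and merged again with `recompose`, which mirrors the idea of splitting locomotion into subparts that are easier to learn.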

updated: Mon Apr 13 2020 19:33:42 GMT+0000 (UTC)

published: Mon Nov 04 2019 11:30:12 GMT+0000 (UTC)

arXiv
