Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS

Long Chen; Haizhou Ai; Rui Chen; Zijie Zhuang; Shuang Liu

Estimating the 3D poses of multiple humans in real time is a classic but still challenging task in computer vision. The main difficulty lies in the ambiguity of cross-view association of 2D poses and in the huge state space that arises when multiple people appear in multiple views. In this paper, we present a novel solution for multi-human 3D pose estimation from multiple calibrated camera views. It takes 2D poses in different camera coordinates as input and aims to produce accurate 3D poses in global coordinates. Unlike previous methods, which associate 2D poses among all pairs of views from scratch at every frame, we exploit the temporal consistency in videos to match the 2D inputs with 3D poses directly in 3D space. More specifically, we propose to retain a 3D pose for each person and update it iteratively via cross-view multi-human tracking. This novel formulation improves both accuracy and efficiency, as we demonstrate on widely used public datasets. To further verify the scalability of our method, we propose a new large-scale multi-human dataset with 12 to 28 camera views. Without bells and whistles, our solution achieves 154 FPS on 12 cameras and 34 FPS on 28 cameras, indicating its ability to handle large-scale real-world applications. The proposed dataset is released at https://github.com/longcw/crossview_3d_pose_tracking.
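The idea of matching 2D inputs against retained 3D poses can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the association cost or the assignment solver, so the mean reprojection error, the greedy matching, the threshold `max_err`, and the function names `project` and `associate` are all assumptions made for illustration.

```python
import numpy as np

def project(P, X):
    """Project (J, 3) world-space joints X to (J, 2) pixels with a 3x4 camera matrix P."""
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]

def associate(P, tracks_3d, detections_2d, max_err=50.0):
    """Greedily match the 2D poses detected in one view to retained 3D poses.

    Cost (an assumption): mean per-joint reprojection error in pixels.
    Returns {detection_index: track_index}; unmatched detections are omitted.
    """
    if len(tracks_3d) == 0 or len(detections_2d) == 0:
        return {}
    cost = np.array([[np.linalg.norm(project(P, t) - d, axis=1).mean()
                      for t in tracks_3d] for d in detections_2d])
    matches, used_tracks = {}, set()
    # Visit candidate pairs from cheapest to most expensive.
    for flat in np.argsort(cost, axis=None):
        di, ti = (int(v) for v in np.unravel_index(flat, cost.shape))
        if cost[di, ti] > max_err:
            break  # every remaining pair is worse than the threshold
        if di in matches or ti in used_tracks:
            continue
        matches[di] = ti
        used_tracks.add(ti)
    return matches
```

In this sketch each view is matched against the retained 3D poses independently, so the per-frame work grows with the number of views rather than with the number of view pairs. A matched detection would then be used to update its 3D pose, and an unmatched one would be a candidate for starting a new track; both steps are omitted here.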

updated: Thu Jul 29 2021 03:02:33 GMT+0000 (UTC)

published: Mon Mar 09 2020 08:54:00 GMT+0000 (UTC)

arXiv
