Self-Supervised 3D Human Pose Estimation with Multiple-View Geometry

Arij Bouazizi; Julian Wiederer; Ulrich Kressel; Vasileios Belagiannis

We present a self-supervised learning algorithm for single-person 3D human pose estimation based on a multiple-view camera system and 2D body pose estimates for each view. To train our model, represented by a deep neural network, we propose a learning algorithm with four loss functions that does not require any 2D or 3D body pose ground truth. The proposed loss functions use multiple-view geometry to reconstruct 3D body pose estimates and impose body pose constraints across the camera views. Our approach uses all available camera views during training, while inference is single-view. In our evaluations, we show promising performance on the Human3.6M and HumanEva benchmarks, and we also present a generalization study on the MPI-INF-3DHP dataset as well as several ablation results. Overall, we outperform all self-supervised learning methods and reach results comparable to supervised and weakly-supervised learning approaches. Our code and models are publicly available.
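
As a rough illustration of the multiple-view geometry the abstract refers to, the sketch below triangulates one 3D joint from its 2D estimates in several calibrated views using the standard linear (DLT) method, and measures how well the result reprojects into each view. The abstract does not say which triangulation method or loss formulation the authors use, so this is not their implementation: the function names (`triangulate_point`, `project`, `reprojection_error`) and the assumption of known 3x4 projection matrices are hypothetical choices made for the example.

```python
import numpy as np


def triangulate_point(projections, points_2d):
    """Linear (DLT) triangulation of one 3D point from two or more views.

    projections: sequence of 3x4 camera projection matrices (assumed known).
    points_2d:   sequence of (x, y) image coordinates, one per view.
    Returns the 3D point as a length-3 array.
    """
    rows = []
    for P, (x, y) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]


def project(P, X):
    """Project a 3D point X (length 3) with a 3x4 projection matrix P."""
    x_h = P @ np.append(X, 1.0)
    return x_h[:2] / x_h[2]


def reprojection_error(projections, points_2d, X):
    """Mean Euclidean distance between observed and reprojected 2D points."""
    errors = [
        np.linalg.norm(project(P, X) - np.asarray(p, dtype=float))
        for P, p in zip(projections, points_2d)
    ]
    return float(np.mean(errors))
```

In a self-supervised setting, a quantity like this reprojection error could serve as a training signal that needs no pose ground truth, because it compares a reconstructed 3D joint only against the 2D estimates already available in each view. Whether the paper's four loss functions take this form is not stated in the abstract.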

updated: Tue Aug 17 2021 17:31:24 GMT+0000 (UTC)

published: Tue Aug 17 2021 17:31:24 GMT+0000 (UTC)

arXiv
