Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images

Size Wu; Sheng Jin; Wentao Liu; Lei Bai; Chen Qian; Dong Liu; Wanli Ouyang

This paper studies the task of estimating the 3D human poses of multiple persons from multiple calibrated camera views. Following the top-down paradigm, we decompose the task into two stages, i.e., person localization and pose estimation. Both stages are processed in a coarse-to-fine manner, and we propose three task-specific graph neural networks for effective message passing. For 3D person localization, we first use the Multi-view Matching Graph Module (MMG) to learn the cross-view association and recover coarse human proposals. The Center Refinement Graph Module (CRG) then refines the results via flexible point-based prediction. For 3D pose estimation, the Pose Regression Graph Module (PRG) learns both the multi-view geometry and the structural relations between human joints. Our approach achieves state-of-the-art performance on the CMU Panoptic and Shelf datasets with significantly lower computational complexity.
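To make the phrase "message passing" concrete, the following is a minimal sketch of one generic message-passing round on a small graph. It is not the paper's MMG, CRG, or PRG architecture: the mean aggregation, the fixed 0.5 mixing weight, the function name `message_passing_step`, and the three-joint toy graph are all assumptions chosen for illustration, and a real graph neural network would use learned weights in place of the fixed average.

```python
from typing import Dict, List

Features = Dict[str, List[float]]
Adjacency = Dict[str, List[str]]


def message_passing_step(features: Features, neighbors: Adjacency) -> Features:
    """One round of mean-aggregation message passing (illustrative, no learned weights)."""
    updated: Features = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            # Isolated node: nothing to aggregate, keep its features unchanged.
            updated[node] = list(own)
            continue
        # Message = element-wise mean of the neighbors' feature vectors.
        mean_msg = [
            sum(features[n][d] for n in nbrs) / len(nbrs)
            for d in range(len(own))
        ]
        # Update = equal mix of the node's own features and the message
        # (the 0.5 weight is an assumption standing in for learned parameters).
        updated[node] = [0.5 * o + 0.5 * m for o, m in zip(own, mean_msg)]
    return updated


# Hypothetical toy graph: three joints of one arm with 2D features.
features = {"shoulder": [0.0, 0.0], "elbow": [2.0, 0.0], "wrist": [4.0, 4.0]}
neighbors = {
    "shoulder": ["elbow"],
    "elbow": ["shoulder", "wrist"],
    "wrist": ["elbow"],
}

smoothed = message_passing_step(features, neighbors)
print(smoothed)
```

Each node ends up with a blend of its own features and those of its graph neighbors, which is the basic mechanism by which information about connected joints (or, in other graph designs, detections in different views) can be shared.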

updated: Mon Sep 13 2021 11:44:07 GMT+0000 (UTC)

published: Mon Sep 13 2021 11:44:07 GMT+0000 (UTC)

arXiv
