VTP: Volumetric Transformer for Multi-view Multi-person 3D Pose Estimation

Yuxing Chen; Renshu Gu; Ouhan Huang; Gangyong Jia

This paper presents the Volumetric Transformer Pose estimator (VTP), the first 3D volumetric transformer framework for multi-view multi-person 3D human pose estimation. VTP aggregates features from 2D keypoints in all camera views and directly learns the spatial relationships in the 3D voxel space in an end-to-end fashion. The aggregated 3D features are passed through 3D convolutions before being flattened into sequential embeddings and fed into a transformer. A residual structure is designed to further improve performance. In addition, sparse Sinkhorn attention is employed to reduce the memory cost, which is a major bottleneck for volumetric representations, while still achieving excellent performance. The output of the transformer is concatenated with the 3D convolutional features through the residual design. The proposed VTP framework combines the high performance of the transformer with volumetric representations and can serve as a good alternative to convolutional backbones. Experiments on the Shelf, Campus and CMU Panoptic benchmarks show promising results in terms of both Mean Per Joint Position Error (MPJPE) and Percentage of Correctly estimated Parts (PCP). Our code will be made available.
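
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how they are commonly computed. It is not taken from the paper or its code. The function names `mpjpe` and `pcp`, the toy joint coordinates, the limb list, and the threshold `alpha=0.5` are all assumptions made for illustration; the PCP rule shown (mean endpoint error within half the ground-truth limb length) is one widely used convention and may differ in detail from the evaluation protocol the authors follow.

```python
import math

def mpjpe(pred, gt):
    """Mean Euclidean distance between matching predicted and ground-truth joints."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of joints")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

def pcp(pred, gt, limbs, alpha=0.5):
    """Percentage of limbs counted as correct.

    Assumed convention: a limb (a, b) is correct when the mean error of its
    two endpoints is at most alpha times the ground-truth limb length.
    """
    correct = 0
    for a, b in limbs:
        endpoint_error = (math.dist(pred[a], gt[a]) + math.dist(pred[b], gt[b])) / 2
        if endpoint_error <= alpha * math.dist(gt[a], gt[b]):
            correct += 1
    return 100.0 * correct / len(limbs)

# Hypothetical toy skeleton: three joints on a line, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 100.0), (0.0, 0.0, 200.0)]
pred = [(3.0, 4.0, 0.0), (0.0, 0.0, 100.0), (0.0, 120.0, 200.0)]
limbs = [(0, 1), (1, 2)]  # pairs of joint indices, illustrative only

print(mpjpe(pred, gt))       # per-joint errors are 5, 0 and 120 mm
print(pcp(pred, gt, limbs))  # the second limb exceeds the threshold
```

Lower MPJPE is better, while higher PCP is better, so the two metrics are usually reported side by side.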

updated: Wed May 25 2022 09:26:42 GMT+0000 (UTC)

published: Wed May 25 2022 09:26:42 GMT+0000 (UTC)

arXiv
