Geometric Pose Affordance: 3D Human Pose with Scene Constraints

Zhe Wang; Liyan Chen; Shaurya Rathore; Daeyun Shin; Charless Fowlkes

Full 3D estimation of human pose from a single image remains a challenging task despite many recent advances. In this paper, we explore the hypothesis that strong prior information about scene geometry can be used to improve pose estimation accuracy. To tackle this question empirically, we have assembled a novel Geometric Pose Affordance dataset, consisting of multi-view imagery of people interacting with a variety of rich 3D environments. We utilized a commercial motion capture system to collect gold-standard estimates of pose and construct accurate geometric 3D CAD models of the scene itself. To inject prior knowledge of scene constraints into existing frameworks for pose estimation from images, we introduce a novel, view-based representation of scene geometry, a multi-layer depth map, which employs multi-hit ray tracing to concisely encode multiple surface entry and exit points along each camera view ray direction. We propose two different mechanisms for integrating multi-layer depth information into pose estimation: first, as encoded ray features used as input when lifting 2D pose to full 3D, and second, as a differentiable loss that encourages learned models to favor geometrically consistent pose estimates. We show experimentally that these techniques can improve the accuracy of 3D pose estimates, particularly in the presence of occlusion and complex scene geometry.
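The multi-layer depth map and the geometric consistency idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy scene made of disjoint axis-aligned boxes instead of CAD meshes, measures depth as distance along a unit-length view ray, and uses a simple hinge-style penalty as a stand-in for the paper's loss. The function names `ray_box_interval`, `multi_layer_depth`, and `penetration_penalty` are hypothetical and chosen for illustration only.

```python
import numpy as np

def ray_box_interval(origin, direction, box_min, box_max):
    """Slab test: return (t_enter, t_exit) along the ray, or None on a miss."""
    t_enter, t_exit = 0.0, np.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:
            # Ray is parallel to this slab: it must already lie inside it.
            if o < lo or o > hi:
                return None
            continue
        t_a, t_b = (lo - o) / d, (hi - o) / d
        t_enter = max(t_enter, min(t_a, t_b))
        t_exit = min(t_exit, max(t_a, t_b))
    if t_enter >= t_exit:
        return None
    return t_enter, t_exit

def multi_layer_depth(origin, direction, boxes, num_layers=4):
    """Alternating entry/exit depths along one view ray, padded with inf.

    Assumes the boxes do not overlap, so sorted hits alternate
    entry, exit, entry, exit, ...
    """
    hits = []
    for box_min, box_max in boxes:
        interval = ray_box_interval(origin, direction, box_min, box_max)
        if interval is not None:
            hits.extend(interval)
    hits.sort()
    layers = np.full(num_layers, np.inf)
    n = min(len(hits), num_layers)
    layers[:n] = hits[:n]
    return layers

def penetration_penalty(joint_depth, layers):
    """Illustrative penalty: how far a joint sits inside occupied space.

    Zero when the joint lies in free space; otherwise the distance to the
    nearest bounding surface (entry or exit) along the ray.
    """
    penalty = 0.0
    for entry, exit_ in zip(layers[0::2], layers[1::2]):
        if np.isfinite(entry) and entry < joint_depth < exit_:
            penalty += min(joint_depth - entry, exit_ - joint_depth)
    return penalty

# Toy scene (assumed): two boxes in front of a camera looking along +z.
camera = np.array([0.0, 0.0, 0.0])
ray = np.array([0.0, 0.0, 1.0])
scene = [
    (np.array([-1.0, -1.0, 2.0]), np.array([1.0, 1.0, 3.0])),
    (np.array([-1.0, -1.0, 5.0]), np.array([1.0, 1.0, 6.0])),
]
layers = multi_layer_depth(camera, ray, scene)
print(layers)                             # entry/exit depths for this ray
print(penetration_penalty(2.25, layers))  # joint inside the first box
print(penetration_penalty(4.0, layers))   # joint in free space
```

Storing entry and exit depths per pixel ray keeps the representation aligned with the image, which is what allows the same values to be fed to a lifting network as features or compared against predicted joint depths in a loss. In a learned setting the penalty would be written with a differentiable framework so gradients flow to the predicted depth; the NumPy version above only shows the geometry.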

updated: Thu Dec 09 2021 01:20:45 GMT+0000 (UTC)

published: Sun May 19 2019 10:04:52 GMT+0000 (UTC)

arXiv
