HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media

Anargyros Chatzitofis; Leonidas Saroglou; Prodromos Boutis; Petros Drakoulis; Nikolaos Zioulis; Shishir Subramanyam; Bart Kevelham; Caecilia Charbonnier; Pablo Cesar; Dimitrios Zarpalas; Stefanos Kollias; Petros Daras

We introduce HUMAN4D, a large and multimodal 4D dataset that contains a variety of human activities simultaneously captured by a professional marker-based MoCap system, a volumetric capture system and an audio recording system. By capturing 2 female and 2 male professional actors performing various full-body movements and expressions, HUMAN4D provides a diverse set of motions and poses encountered as part of single- and multi-person daily, physical and social activities (jumping, dancing, etc.), along with multi-RGBD (mRGBD), volumetric and audio data. Despite the existence of multi-view color datasets captured with hardware (HW) synchronization, to the best of our knowledge, HUMAN4D is the first and only public resource that provides volumetric depth maps with high synchronization precision, owing to the use of intra- and inter-sensor HW-SYNC. Moreover, a spatio-temporally aligned scanned and rigged 3D character complements HUMAN4D to enable joint research on time-varying and high-quality dynamic meshes. We provide evaluation baselines by benchmarking HUMAN4D with state-of-the-art human pose estimation and 3D compression methods. For the former, we apply 2D and 3D pose estimation algorithms on both single- and multi-view data cues. For the latter, we benchmark open-source 3D codecs on volumetric data with regard to online volumetric video encoding and steady bit-rates. Furthermore, a qualitative and quantitative visual comparison between mesh-based volumetric data reconstructed at different quality levels showcases the available options with respect to 4D representations. HUMAN4D is introduced to the computer vision and graphics research communities to enable joint research on spatio-temporally aligned pose, volumetric, mRGBD and audio data cues. The dataset and its code are available at https://tofis.github.io/myurls/human4d.

updated: Thu Oct 14 2021 09:03:35 GMT+0000 (UTC)

published: Thu Oct 14 2021 09:03:35 GMT+0000 (UTC)

arXiv
