Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception

Xiaqing Pan; Nicholas Charron; Yongqian Yang; Scott Peters; Thomas Whelan; Chen Kong; Omkar Parkhi; Richard Newcombe; Carl Ren

We introduce the Aria Digital Twin (ADT), an egocentric dataset captured using Aria glasses with extensive object-, environment-, and human-level ground truth. This ADT release contains 200 sequences of real-world activities conducted by Aria wearers in two real indoor scenes with 398 object instances (324 stationary and 74 dynamic). Each sequence consists of: a) raw data from two monochrome camera streams, one RGB camera stream, and two IMU streams; b) complete sensor calibration; c) ground truth data including continuous 6-degree-of-freedom (6DoF) poses of the Aria devices, object 6DoF poses, 3D eye gaze vectors, 3D human poses, 2D image segmentations, and image depth maps; and d) photo-realistic synthetic renderings. To the best of our knowledge, there is no existing egocentric dataset with a level of accuracy, photo-realism, and comprehensiveness comparable to ADT. By contributing ADT to the research community, our mission is to set a new standard for evaluation in the egocentric machine perception domain, which includes very challenging research problems such as 3D object detection and tracking, scene reconstruction and understanding, sim-to-real learning, and human pose prediction, while also inspiring new machine perception tasks for augmented reality (AR) applications. To kick-start exploration of the ADT research use cases, we evaluated several existing state-of-the-art methods on object detection, segmentation, and image translation tasks; the results demonstrate the usefulness of ADT as a benchmarking dataset.
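The per-sequence contents listed in the abstract can be pictured as a simple manifest. The following is a minimal sketch in Python, assuming hypothetical stream and field names chosen only for illustration; it does not reflect the actual ADT file layout or any official tooling API.

```python
from dataclasses import dataclass, field

# All names below are hypothetical labels for the items listed in the abstract;
# they are not identifiers from the real ADT release or its tooling.
RAW_STREAMS = ("mono_camera_1", "mono_camera_2", "rgb_camera", "imu_1", "imu_2")
GROUND_TRUTH = (
    "device_6dof_pose",
    "object_6dof_pose",
    "eye_gaze_3d",
    "human_pose_3d",
    "segmentation_2d",
    "depth_map",
)

# Object counts as stated in the abstract.
STATIONARY_OBJECTS = 324
DYNAMIC_OBJECTS = 74
TOTAL_OBJECTS = STATIONARY_OBJECTS + DYNAMIC_OBJECTS


@dataclass
class SequenceManifest:
    """Tracks which of the abstract's listed components a sequence provides."""

    name: str
    available: set = field(default_factory=set)
    has_calibration: bool = False
    has_synthetic_rendering: bool = False

    def missing(self):
        # Components named in the abstract that this manifest does not list.
        expected = set(RAW_STREAMS) | set(GROUND_TRUTH)
        return sorted(expected - self.available)

    def is_complete(self):
        # Complete means items a) through d) of the abstract are all present.
        return (
            not self.missing()
            and self.has_calibration
            and self.has_synthetic_rendering
        )
```

A manifest built with only the raw streams would report the six ground-truth items as missing, which is one way to sanity-check a partial download before running a benchmark.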

updated: Sat Jun 10 2023 06:46:32 GMT+0000 (UTC)

published: Sat Jun 10 2023 06:46:32 GMT+0000 (UTC)

arXiv
