The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose

Yizhak Ben-Shabat; Xin Yu; Fatemeh Sadat Saleh; Dylan Campbell; Cristian Rodriguez-Opazo; Hongdong Li; Stephen Gould

The availability of a large labeled dataset is a key requirement for applying deep learning methods to solve various computer vision tasks. In the context of understanding human activities, existing public datasets, while large in size, are often limited to a single RGB camera and provide only per-frame or per-clip action annotations. To enable richer analysis and understanding of human activities, we introduce IKEA ASM, a three-million-frame, multi-view furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose. Additionally, we benchmark prominent methods for video action recognition, object segmentation, and human pose estimation on this challenging dataset. The dataset enables the development of holistic methods that integrate multi-modal and multi-view data to perform better on these tasks.

updated: Wed May 17 2023 07:56:52 GMT+0000 (UTC)

published: Wed Jul 01 2020 11:34:46 GMT+0000 (UTC)

arXiv
