Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events

Weiyao Lin; Huabin Liu; Shizhan Liu; Yuxi Li; Rui Qian; Tao Wang; Ning Xu; Hongkai Xiong; Guo-Jun Qi; Nicu Sebe

With the development of modern smart cities, human-centric video analysis faces the challenge of analyzing diverse and complex events in real scenes. A complex event involves dense crowds or anomalous or collective behaviors. However, limited by the scale of existing video datasets, few human analysis approaches have reported their performance on such complex events. To this end, we present a new large-scale dataset, named Human-in-Events or HiEve (Human-centric video analysis in complex Events), for understanding human motions, poses, and actions in a variety of realistic events, especially crowded and complex ones. It contains a record number of poses (>1M), the largest number of action instances (>56k) under complex events, and one of the largest numbers of long-lasting trajectories (with an average trajectory length of >480 frames). Based on this dataset, we present an enhanced pose estimation baseline that uses action information to guide the learning of more powerful 2D pose features. We demonstrate that the proposed method boosts the performance of existing pose estimation pipelines on our HiEve dataset. Furthermore, we conduct extensive experiments to benchmark recent video analysis approaches together with our baseline methods, demonstrating that HiEve is a challenging dataset for human-centric video analysis. We expect that the dataset will advance the development of cutting-edge techniques in human-centric analysis and the understanding of complex events. The dataset is available at http://humaninevents.org.

updated: Sun Mar 14 2021 06:24:52 GMT+0000 (UTC)

published: Sat May 09 2020 18:24:52 GMT+0000 (UTC)

arXiv
