MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions

Yixuan Li; Lei Chen; Runyu He; Zhenzhi Wang; Gangshan Wu; Limin Wang

Spatio-temporal action detection is an important and challenging problem in video understanding. Existing action detection benchmarks are limited either by the small number of instances in each trimmed video or by their focus on low-level atomic actions. This paper presents a new multi-person dataset of spatio-temporally localized sports actions, coined MultiSports. We first analyze the important ingredients for constructing a realistic and challenging dataset for spatio-temporal action detection by proposing three criteria: (1) multi-person scenes and motion-dependent identification, (2) actions with well-defined boundaries, and (3) relatively fine-grained classes of high complexity. Based on these guidelines, we build MultiSports v1.0 by selecting 4 sports classes, collecting 3200 video clips, and annotating 37701 action instances with 902k bounding boxes. Our dataset is characterized by high diversity, dense annotation, and high quality. MultiSports, with its realistic setting and detailed annotations, exposes the intrinsic challenges of spatio-temporal action detection. To benchmark this, we adapt several baseline methods to our dataset and give an in-depth analysis of the action detection results. We hope MultiSports can serve as a standard benchmark for spatio-temporal action detection in the future. The dataset website is at https://deeperaction.github.io/multisports/.

updated: Wed Aug 18 2021 05:27:50 GMT+0000 (UTC)

published: Sun May 16 2021 10:40:30 GMT+0000 (UTC)

arXiv
