Leveraging Third-Order Features in Skeleton-Based Action Recognition

Zhenyue Qin; Yang Liu; Pan Ji; Dongwoo Kim; Lei Wang; R. I. McKay; Saeed Anwar; Tom Gedeon

Skeleton sequences are lightweight and compact, and thus ideal candidates for action recognition on edge devices. Recent skeleton-based action recognition methods extract features from 3D joint coordinates as spatial-temporal cues and fuse these representations in a graph neural network to boost recognition performance. The use of first- and second-order features, i.e., joint and bone representations, has led to high accuracy, but many models are still confused by actions that have similar motion trajectories. To address this issue, we propose fusing third-order features in the form of angles into modern architectures, to robustly capture the relationships between joints and body parts. This simple fusion with popular spatial-temporal graph neural networks achieves new state-of-the-art accuracy on two large benchmarks, NTU60 and NTU120, while using fewer parameters and less run time. Our source code is publicly available at: https://github.com/ZhenyueQin/Angular-Skeleton-Encoding.
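
To make the feature orders concrete, the following is a minimal sketch of how an angle (third-order) feature could be derived from joint coordinates (first-order) by way of bone vectors (second-order). It is an illustration only, not the encoding used in the paper or its repository: the function names `bone` and `angle_cosine`, the choice of the cosine as the angle measure, and the handling of zero-length bones are all assumptions made here.

```python
import math

def bone(joints, src, dst):
    """Second-order feature: the vector from joint `src` to joint `dst`."""
    return tuple(d - s for s, d in zip(joints[src], joints[dst]))

def angle_cosine(joints, center, end_a, end_b, eps=1e-8):
    """Third-order feature (hypothetical form): cosine of the angle at
    `center` between the bones center->end_a and center->end_b."""
    u = bone(joints, center, end_a)
    v = bone(joints, center, end_b)
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm < eps:
        # Assumption: a degenerate (zero-length) bone yields a neutral value.
        return 0.0
    return dot / norm

# Toy frame with three 3D joints (first-order features); values are made up.
frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
right_angle = angle_cosine(frame, center=0, end_a=1, end_b=2)

# Two bones pointing in opposite directions from the center joint.
line = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-3.0, 0.0, 0.0)]
straight = angle_cosine(line, center=0, end_a=1, end_b=2)
```

In this sketch the cosine depends only on the directions of the two bones, not their lengths, so the value stays the same when the skeleton is uniformly scaled or translated.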

updated: Tue May 04 2021 15:23:29 GMT+0000 (UTC)

published: Tue May 04 2021 15:23:29 GMT+0000 (UTC)

arXiv
