Real-time 3D human action recognition based on Hyperpoint sequence
Real-time 3D human action recognition has broad industrial applications, such as surveillance, human-computer interaction, and healthcare monitoring. Most existing point cloud sequence networks rely on complex spatio-temporal local encoding to capture spatio-temporal local structures for recognizing 3D human actions. To simplify the point cloud sequence modeling task, we propose a lightweight and effective point cloud sequence network, referred to as SequentialPointNet, for real-time 3D action recognition. Instead of capturing spatio-temporal local structures, SequentialPointNet encodes the temporal evolution of static appearances to recognize human actions. First, we define a novel type of point data, Hyperpoint, to better describe the temporally changing human appearances. A theoretical foundation is provided to clarify the information equivalence property for converting point cloud sequences into Hyperpoint sequences. Second, the point cloud sequence modeling task is decomposed into a Hyperpoint embedding task and a Hyperpoint sequence modeling task. Specifically, for Hyperpoint embedding, static point cloud techniques are employed to convert point cloud sequences into Hyperpoint sequences, which introduces inherent frame-level parallelism; for Hyperpoint sequence modeling, a Hyperpoint-Mixer module is designed as the basic building block to learn the spatio-temporal features of human actions. Extensive experiments on three widely used 3D action recognition datasets demonstrate that the proposed SequentialPointNet achieves competitive classification performance while running up to 10X faster than existing approaches.
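The two-stage decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a PointNet-style shared linear layer with max pooling as a stand-in for the static point cloud embedding, and a plain max over time as a placeholder for the Hyperpoint-Mixer, whose design the abstract does not specify. All function names, shapes, and parameters (`embed_frame`, `embed_sequence`, `pool_over_time`, `T`, `N`, `D`) are hypothetical.

```python
import random


def embed_frame(points, weights, bias):
    """Map one frame (a list of (x, y, z) points) to a single D-dim vector.

    Assumption: a shared linear layer + ReLU is applied to every point and
    the results are max-pooled over points, so the output is independent of
    point order. This only stands in for the paper's embedding step.
    """
    dim = len(bias)
    out = [0.0] * dim  # ReLU outputs are >= 0, so 0.0 is a valid pooling start
    for p in points:
        for j in range(dim):
            h = sum(p[k] * weights[k][j] for k in range(3)) + bias[j]
            out[j] = max(out[j], h)  # ReLU folded into the running max
    return out


def embed_sequence(frames, weights, bias):
    """Embed every frame independently of the others.

    No frame reads another frame's data, which is what makes this stage
    parallelizable at the frame level.
    """
    return [embed_frame(f, weights, bias) for f in frames]


def pool_over_time(frame_vectors):
    """Placeholder sequence model: element-wise max over the frame axis."""
    return [max(column) for column in zip(*frame_vectors)]


rng = random.Random(0)
T, N, D = 4, 32, 8  # frames, points per frame, embedding size (all made up)
frames = [
    [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
    for _ in range(T)
]
weights = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(3)]
bias = [0.0] * D

frame_vectors = embed_sequence(frames, weights, bias)  # T vectors of size D
video_feature = pool_over_time(frame_vectors)          # one vector of size D
print(len(frame_vectors), len(frame_vectors[0]), len(video_feature))
```

Because `embed_frame` never looks at neighboring frames, the first stage could be batched or distributed across frames without changing its result; only the second stage mixes information over time.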
updated: Mon Feb 26 2024 08:48:08 GMT+0000 (UTC)
published: Tue Nov 16 2021 14:13:32 GMT+0000 (UTC)