Derived from rapid advances in computer vision and machine learning, video analysis tasks have been moving from inferring the present state to predicting the future state. Vision-based action recognition and prediction from videos are such tasks, where action recognition is to infer human actions (present state) based upon complete action executions, and action prediction to predict human actions (future state) based upon incomplete action executions. These two tasks have become particularly prevalent topics recently because of their explosively emerging real-world applications, such as visual surveillance, autonomous driving vehicle, entertainment, and video retrieval, etc. Many attempts have been devoted in the last a few decades in order to build a robust and effective framework for action recognition and prediction. In this paper, we survey the complete state-of-the-art techniques in action recognition and prediction. Existing models, popular algorithms, technical difficulties, popular action databases, evaluation protocols, and promising future directions are also provided with systematic discussions.
updated: Sun Feb 13 2022 04:11:52 GMT+0000 (UTC)
published: Thu Jun 28 2018 23:43:42 GMT+0000 (UTC)