Human action recognition has broad application prospects and considerable economic value in fields such as smart homes, smart cities, and security. With the continuous development of artificial intelligence, research on human action recognition has progressed rapidly. The technology draws on many disciplines, including image processing, machine learning, computer vision, and human-computer interaction. Traditional methods of human action recognition rely on manually extracted behavioral features, and the heavy workload this involves makes timely recognition difficult to guarantee. Reducing algorithmic complexity while recognizing human actions efficiently has therefore become an important research topic.

To address the problem that traditional action recognition methods require different hand-crafted features for different scenarios, we design a human action recognition method based on deep learning. The model uses Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to learn autonomously from the input data, which reduces the complexity of manual feature extraction and improves the accuracy of action recognition. The main work of this article is as follows:

(1) We analyze and summarize human action recognition based on traditional methods and on deep learning methods, and identify the advantages of deep learning. Traditional human action recognition relies on manual feature extraction, and the resulting algorithms are very complicated. In real scenes, problems such as occlusion of the human body and cluttered backgrounds make manual feature extraction even harder. Deep learning methods simulate the human brain's mechanism for processing visual information and can learn the characteristics of the data autonomously, thereby reducing the complexity of the algorithm. At the same time, we find that existing deep learning algorithms have problems in processing dynamic sequence information and in selectively focusing on relevant content.

(2) We design a deep learning method for human action recognition based on a spatio-temporal network. The dual-stream spatio-temporal network is built from BN-Inception and IndRNN: a spatial-stream convolutional neural network processes the RGB images of video frames, and a temporal-stream convolutional neural network processes consecutive optical flow images. We extract feature vectors layer by layer and then use a multi-layer IndRNN recurrent neural network to learn the feature sequences, which carry temporal information. Finally, we obtain the classification label of the video clip. Experiments on the UCF101 dataset show that the spatio-temporal deep learning model we designed improves human action recognition performance.

(3) Building on the spatio-temporal network method, we design a human action recognition method that integrates the spatio-temporal network with an attention mechanism. The visual attention mechanism adds weight information to the deep visual features extracted by the convolutional neural network, producing a new feature sequence that incorporates the attention weights. The IndRNN recurrent neural network then decodes this feature sequence, and finally we obtain the classification label of each video clip. Experiments on the UCF101 dataset demonstrate that the attention mechanism improves the network's ability to recognize actions.
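
The pipeline in (3), attention-weighted features followed by an IndRNN and a classifier, can be illustrated with a minimal sketch. This is not the model described above: the text does not specify the form of the attention, so dot-product soft attention over spatial regions is assumed here; the BN-Inception features are replaced by random vectors; a single IndRNN layer stands in for the multi-layer network; and all names (`attend`, `indrnn_step`, `classify_clip`, `query`, `W`, `u`, `b`, `V`) are hypothetical. The IndRNN update follows the standard form h_t = relu(W x_t + u * h_{t-1} + b), where the recurrent weight u is applied element-wise.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(regions, query):
    """Soft attention over the spatial regions of one frame (assumed form).

    regions: K feature vectors of length D, standing in for CNN feature-map
             locations.
    query:   hypothetical scoring vector of length D.
    Returns the attention weights and the weighted-sum feature vector.
    """
    scores = [sum(r_d * q_d for r_d, q_d in zip(r, query)) for r in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    context = [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
    return weights, context

def indrnn_step(x, h_prev, W, u, b):
    """One IndRNN step: h_t = relu(W x_t + u * h_{t-1} + b).

    u is a vector, so each hidden unit only sees its own previous state.
    """
    h = []
    for n in range(len(u)):
        pre = sum(W[n][d] * x[d] for d in range(len(x))) + u[n] * h_prev[n] + b[n]
        h.append(max(0.0, pre))
    return h

def classify_clip(frames, query, W, u, b, V):
    """Attention, then IndRNN over time, then a linear classifier on the last state."""
    h = [0.0] * len(u)
    for regions in frames:
        _, context = attend(regions, query)
        h = indrnn_step(context, h, W, u, b)
    logits = [sum(V[c][n] * h[n] for n in range(len(h))) for c in range(len(V))]
    probs = softmax(logits)
    return probs.index(max(probs)), probs

# Toy usage: T frames, K regions per frame, D-dim features, N hidden units, C classes.
rng = random.Random(0)
T, K, D, N, C = 5, 4, 6, 8, 3
frames = [[[rng.uniform(-1, 1) for _ in range(D)] for _ in range(K)] for _ in range(T)]
query = [rng.uniform(-1, 1) for _ in range(D)]
W = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(N)]
u = [rng.uniform(0, 1) for _ in range(N)]
b = [0.0] * N
V = [[rng.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(C)]
label, probs = classify_clip(frames, query, W, u, b, V)
print(label, [round(p, 3) for p in probs])
```

In this sketch the weights are random, so the predicted label is meaningless; the point is only the data flow, in which each frame's regions are reweighted before the recurrent layer consumes them.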