
Research And Implementation Of Video Action Recognition Based On Deep Learning

Posted on: 2021-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Li
Full Text: PDF
GTID: 2428330611968834
Subject: Electronic and communication engineering
Abstract/Summary:
Human action recognition has broad application prospects in many scenarios. As deep learning has become a core technology in the development of artificial intelligence, applying deep learning methods to human action recognition in videos is of considerable significance and value. The convolutional neural network (CNN), a representative deep learning algorithm, has been applied successfully in computer vision tasks such as image recognition and object detection. Compared with images, video has a temporal dimension and a greater diversity of data formats, so how to use CNNs to learn the temporal information of human actions in video effectively remains a question worth in-depth study. For this reason, this thesis uses CNNs to study and improve video action recognition algorithms, and proposes and implements corresponding schemes for action recognition in several video data formats. In addition, to explore and promote the practical application of action recognition technology, an action recognition system for video surveillance scenes is implemented to recognize human actions in video automatically. The main contents and innovations are as follows:

1. For depth video, the spatial structure dynamic depth map technique is first used to compress the video into a two-dimensional representation of the action. Because this representation has a low level of abstraction, a CNN with joint supervision is then designed to learn and abstract features from it more fully, improving the network's intra-class aggregation and inter-class separation. The network design also takes into account that training tends to overfit when samples are few.

2. For RGB video, a network based on spatial-temporal attention and an adaptive fusion strategy is proposed. The spatial-temporal attention mechanism is simple and efficient: it acts on the stacked features of multiple video frames and weights the spatial-temporal features using correlation coefficients. The adaptive fusion strategy introduces a three-dimensional convolutional neural network to fuse high-level features adaptively, which can effectively improve model performance and generalization capability. Because the strategy operates on high-level features with a small input scale, it also greatly improves the computational efficiency of the model.

3. For the practical application of video action recognition, the model proposed in this thesis serves as the core of the system, and online video action recognition is realized with an improved temporal sliding window method. The performance and accuracy of the system are then analyzed quantitatively: the accuracy at a given temporal overlap rate is used as the criterion for recognition accuracy, and the time required to process each minute of video is used as the criterion for operating efficiency. This thesis thus explores the practical application of action recognition.
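
As a rough illustration of item 3, the sketch below shows a plain (not the thesis's improved) temporal sliding window over a frame stream, together with one possible reading of "accuracy at a given temporal overlap rate". The function names, the intersection-over-union definition of overlap, and the rule that a ground-truth segment counts as recognized when a same-label prediction reaches the threshold are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Tuple

# Hypothetical types: a labelled segment is (label, start_sec, end_sec).
Labelled = Tuple[str, float, float]


def sliding_windows(num_frames: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame index pairs for a plain sliding window."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    windows = []
    start = 0
    # Only full windows are emitted; a trailing partial window is dropped.
    while start + window <= num_frames:
        windows.append((start, start + window))
        start += stride
    return windows


def temporal_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Temporal intersection over union of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def accuracy_at_overlap(preds: List[Labelled], truths: List[Labelled],
                        threshold: float) -> float:
    """Fraction of ground-truth segments matched by a same-label prediction
    whose temporal overlap is at least `threshold` (assumed definition)."""
    if not truths:
        return 0.0
    hits = 0
    for label, start, end in truths:
        if any(p_label == label
               and temporal_overlap((p_start, p_end), (start, end)) >= threshold
               for p_label, p_start, p_end in preds):
            hits += 1
    return hits / len(truths)
```

In an online setting, each window returned by `sliding_windows` would be passed to the recognition model, and the resulting labelled segments would be scored with `accuracy_at_overlap`. A smaller stride gives finer temporal localization at the cost of more model evaluations per minute of video, which is the trade-off the efficiency criterion in item 3 measures.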
Keywords/Search Tags: deep learning, action recognition, convolutional neural network, intelligent video surveillance system