Research On Video Action Recognition Method Based On Deep Learning

Posted on: 2020-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Wan
Full Text: PDF
GTID: 2428330578457178
Subject: Software engineering
Abstract/Summary:
The rapid growth of video data makes it very challenging to mine important and interesting information. How to efficiently analyze and process massive video data in order to extract the small fraction of valuable information has become a focus of both industry and academia. Video data is structurally complex and enormous in volume, and traditional manual annotation can no longer keep up with the growing number of videos. There is therefore an urgent need for techniques that classify video automatically by learning its features. During feature extraction, video action recognition is challenged by occlusion, dynamic background changes, camera jitter, and changes in viewing angle and illumination. A video classification algorithm can automatically analyze the semantic information contained in a video, understand its content, and automatically label, classify and describe it, with the goal of matching human accuracy. Large-scale video classification is therefore the next key problem to be solved after image classification. Within this area, action recognition is the focus of this thesis.

Based on deep learning models, this thesis extracts temporal and spatial information from video and proposes two methods aimed at efficient action recognition.

Most current mainstream methods rely on 3D networks and take both RGB and optical flow images as network input, which makes them expensive and time consuming. To address this problem, this thesis proposes a low-rank 3D action recognition method based on object detection that uses only RGB images as input. Firstly, it proposes a video frame preprocessing method based on an object detection algorithm, which avoids the object loss and the influence of cluttered background that general cropping strategies introduce into the network's input frames. In this method, the object detection algorithm is applied to each video frame to locate the target accurately, and the frame is then cropped. The cropped region is used as the network input, so that the object's action information is retained to a great extent. Secondly, it constructs a network structure that combines 2D temporal segmentation with low-rank pseudo-3D convolution to extract and classify video features, and it improves both the operating efficiency and the recognition accuracy of the network by designing a variety of low-rank kernel structures.

Building on the first method, and considering that randomly selecting video frames as network input may yield frames that contain insufficient action information, this thesis proposes a low-rank 3D action recognition method based on reinforcement learning. In this method, a deep reinforcement learning framework is constructed to select key frames as network input, ensuring that the input frames carry adequate action information. The low-rank 3D network based on object detection is then used for action recognition and classification to obtain the final classification result, which improves the recognition accuracy of the network.

To verify the effectiveness of the proposed action recognition methods, multiple comparison and verification experiments were carried out on the public dataset UCF101. The experimental results show that the two proposed methods effectively improve the running speed of the network while achieving a small improvement in recognition accuracy.
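The two ideas behind the first method, detection-guided cropping and low-rank kernels, can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not specify the detector, the crop margin, or the exact kernel shapes, so the function names, the 10% margin, and the 3x3x3, 1x3x3 + 3x1x1, and 1x3x1 + 1x1x3 + 3x1x1 factorizations below are all assumptions chosen for illustration.

```python
def crop_box(det, frame_w, frame_h, margin=0.1):
    """Expand a detected box (x1, y1, x2, y2) by a relative margin and
    clamp it to the frame, so the crop keeps the whole object.
    The margin value is a hypothetical choice, not taken from the thesis."""
    x1, y1, x2, y2 = det
    mx = (x2 - x1) * margin
    my = (y2 - y1) * margin
    return (
        max(0, int(x1 - mx)),
        max(0, int(y1 - my)),
        min(frame_w, int(x2 + mx)),
        min(frame_h, int(y2 + my)),
    )


def conv3d_params(c_in, c_out, kt, kh, kw):
    """Number of weights in a 3D convolution layer (bias ignored)."""
    return c_in * c_out * kt * kh * kw


def full_3d(c_in, c_out):
    """A single dense 3x3x3 convolution."""
    return conv3d_params(c_in, c_out, 3, 3, 3)


def pseudo_3d(c_in, c_out):
    """Assumed pseudo-3D split: 1x3x3 spatial, then 3x1x1 temporal."""
    return (conv3d_params(c_in, c_out, 1, 3, 3)
            + conv3d_params(c_out, c_out, 3, 1, 1))


def low_rank_pseudo_3d(c_in, c_out):
    """Assumed further split of the spatial kernel into 1x3x1 and 1x1x3,
    followed by the 3x1x1 temporal kernel."""
    return (conv3d_params(c_in, c_out, 1, 3, 1)
            + conv3d_params(c_out, c_out, 1, 1, 3)
            + conv3d_params(c_out, c_out, 3, 1, 1))


if __name__ == "__main__":
    print(crop_box((100, 50, 200, 150), frame_w=320, frame_h=240))
    for name, fn in [("full 3D", full_3d),
                     ("pseudo-3D", pseudo_3d),
                     ("low-rank pseudo-3D", low_rank_pseudo_3d)]:
        print(name, fn(64, 64))
```

Under these assumptions, each factorization step reduces the weight count of a layer, which is the general reason low-rank kernels can speed up a 3D network; the actual savings in the thesis depend on its specific kernel designs.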
Keywords/Search Tags: Video classification, Action recognition, Object detection, Low rank, Reinforcement learning