Research On Human Action Recognition Method Based On Deep Learning

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhou
GTID: 2428330605962376
Subject: Control Science and Engineering
Abstract/Summary:
Human action recognition aims to identify the action category shown in an input video of a person's activity. Beyond its theoretical significance in advancing science, technology, and society, it has practical applications in fitness guidance, virtual reality, urban security, and many other fields. State-of-the-art methods fuse RGB and optical flow to extract features from the raw video frames. These methods have three weaknesses: interference from background information, difficulty modeling long-term temporal structure, and ineffective use of related modalities. This thesis proposes two algorithms to address them. The main research contents and results are as follows:

(1) State-of-the-art methods inevitably take in some background information, which introduces substantial noise into the neural network: the image background interferes with feature extraction, and video frames contain a large amount of redundant information. To address these issues, together with class imbalance among samples and the difficulty of classifying certain individual classes, a human action recognition algorithm combined with object detection is proposed. First, an object detection mechanism is added to the recognition process so that the network focuses on learning the action information of the human body. Second, the video is divided into segments and sampled randomly within them, which establishes long-term temporal modeling across the entire video. Finally, the loss function of the neural network is improved. With static RGB input only, the algorithm reaches accuracies of 96% on the UCF101 dataset and 75.3% on the HMDB51 dataset.

(2) Human actions are strongly correlated with human pose. Because many publicly available action recognition datasets do not provide pose data, few recognition methods train on pose data and fuse it with other modalities. Building on the prevailing deep learning approach of fusing RGB and optical flow, a multi-stream convolutional neural network is proposed. First, a pose estimation algorithm generates human body keypoint data from still images containing a person, and the pose is constructed by connecting the keypoints. Second, a spatial network is trained on the RGB image data, a temporal motion network on the optical flow data, and a pose representation network on the pose image data. Finally, score fusion is used to obtain the action category. Experimental results show that the proposed multi-stream convolutional neural network is more accurate than the unmodified two-stream network on the UCF101 and HMDB51 datasets, by 2.3% and 3.1% respectively.
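
The two mechanisms named in the abstract, segment-wise random sampling in (1) and score fusion across streams in (2), can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the choice of one frame per segment, the softmax normalization, and the equal stream weights are all hypothetical, since the abstract does not specify them.

```python
import math
import random


def sample_segment_indices(num_frames, num_segments, rng=random):
    """Split a video into equal segments and pick one random frame per segment.

    Assumption: one frame per segment; the abstract only says the video is
    sampled "segmentally and randomly".
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    # Segment i covers frame indices [bounds[i], bounds[i + 1]).
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(num_segments)]


def softmax(scores):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(stream_scores, weights=None):
    """Weighted average of per-stream class probabilities.

    Assumption: equal weights unless given; the abstract does not state
    how the RGB, optical flow, and pose scores are weighted.
    Returns (predicted_class_index, fused_probabilities).
    """
    if weights is None:
        weights = [1.0] * len(stream_scores)
    total_weight = sum(weights)
    probs = [softmax(s) for s in stream_scores]
    num_classes = len(probs[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, probs)) / total_weight
        for c in range(num_classes)
    ]
    predicted = max(range(num_classes), key=fused.__getitem__)
    return predicted, fused


# Hypothetical scores for three classes from the three streams.
rgb_scores = [2.0, 0.5, 0.1]
flow_scores = [0.3, 1.8, 0.2]
pose_scores = [0.2, 2.2, 0.1]

frame_indices = sample_segment_indices(num_frames=30, num_segments=3)
predicted, fused = fuse_scores([rgb_scores, flow_scores, pose_scores])
print(frame_indices, predicted, fused)
```

Sampling one frame from every segment keeps the input small while still covering the whole video, which is what allows long-term temporal modeling. Averaging probabilities rather than raw scores keeps streams with different score scales comparable.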
Keywords/Search Tags: Deep learning, Action recognition, Convolutional neural network, Computer vision, Object detection, Pose estimation