
Research On Video Human Action Recognition Algorithms Based On Deep Learning

Posted on: 2022-02-10    Degree: Master    Type: Thesis
Country: China    Candidate: P H Xu    Full Text: PDF
GTID: 2518306524989809    Subject: Master of Engineering
Abstract/Summary:
Human action recognition is an important part of video analysis and plays a vital role in fields such as video surveillance, human-computer interaction, and autonomous driving. Traditional human action recognition is based mainly on RGB images or video, but its results are unsatisfactory because of scale variation, illumination changes, and background noise. In recent years, thanks to the development of depth sensors and the maturing of human keypoint detection algorithms, more and more studies have focused on skeleton-based action recognition and have begun to use graph convolution to model and analyze the skeleton. This thesis proposes two improved algorithms based on the spatio-temporal graph convolutional network: a graph convolutional neural network based on human body structure decomposition, and a two-stream graph convolutional neural network based on graph transformation.

Graph convolutional neural network based on human body structure decomposition: the spatio-temporal graph convolutional network captures the motion information of the human body as a whole but cannot obtain the motion information of individual body parts. To address this, the human skeleton is decomposed to obtain finer-grained head, trunk, and leg features, and the three sets of features are fed into the network separately to extract deep features. The deep features are then fused, and a Softmax classifier produces the final recognition result. Comparative experiments show that, compared with the spatio-temporal graph convolutional network, the proposed model better captures the motion information of a specific body part while ignoring the motion of unrelated body parts, and that it greatly improves recognition for most action classes.

Two-stream graph convolutional neural network based on graph transformation: the adjacency matrix used in the spatio-temporal graph convolutional network remains constant during training, which may prevent the model from capturing the connections between keypoints that are associated with the movement. To address this, a spatio-temporal graph convolutional network based on graph transformation is constructed. This network can identify the connection between any two keypoints and enhances the feature representation of each keypoint. The graph transformation module transforms the adjacency matrix according to the input data so as to learn the optimal graph structure. To make full use of the skeleton data, a two-stream network is designed that uses second-order bone information to improve model performance. Visualization analysis shows that the graph transformation module generates a new graph structure that captures the relationships between the keypoints related to the action category, which demonstrates the effectiveness of the module. Moreover, adding second-order bone information improves the model's performance considerably, which demonstrates the importance of this information.
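The two ideas in the second algorithm, an adjacency matrix that depends on the input and a bone stream derived from the joints, can be illustrated with a small pure-Python sketch. It is not the thesis's implementation. The 5-joint skeleton, the function names, the softmax-of-dot-products similarity, the `alpha` mixing weight, and the definition of a bone as child position minus parent position are all assumptions made for illustration, and the learnable weight matrices of a real graph convolution are omitted.

```python
import math

# Hypothetical 5-joint toy skeleton given as (child, parent) index pairs.
# Joint 0 is the root; a real joint layout depends on the dataset.
BONES = [(1, 0), (2, 1), (3, 1), (4, 1)]


def bone_vectors(joints, bones=BONES):
    """Return one vector per joint: child position minus parent position.

    The root has no parent and keeps a zero vector, so the bone stream
    has the same shape as the joint stream (an assumption of this sketch).
    """
    dims = len(joints[0])
    out = [[0.0] * dims for _ in joints]
    for child, parent in bones:
        out[child] = [c - p for c, p in zip(joints[child], joints[parent])]
    return out


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def data_dependent_adjacency(features):
    """Row-normalised dot-product similarity between every pair of joints."""
    scores = [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]
    return [softmax(row) for row in scores]


def graph_conv(features, fixed_adj, alpha=1.0):
    """Aggregate joint features with fixed_adj + alpha * data-dependent adjacency."""
    learned = data_dependent_adjacency(features)
    n, dims = len(features), len(features[0])
    out = []
    for i in range(n):
        weights = [fixed_adj[i][j] + alpha * learned[i][j] for j in range(n)]
        out.append(
            [sum(weights[j] * features[j][d] for j in range(n)) for d in range(dims)]
        )
    return out
```

Because the data-dependent matrix is dense, every joint can receive information from every other joint, including pairs that the fixed skeleton graph does not connect. In a two-stream setup, the output of `bone_vectors` would be passed through a second network of the same form and the two streams' class scores combined.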
Keywords/Search Tags: convolutional neural network, human action recognition, spatial-temporal graph convolution, graph transformation