Research on Human Action Recognition Methods Based on Graph Convolutional Neural Networks

Posted on: 2023-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: J X Zhang
GTID: 2568306794457784
Subject: Electronic and communication engineering
Abstract/Summary:
Action recognition is an important and challenging problem in computer vision that provides technical support for applications such as human-robot interaction, video surveillance, and safe driving. Skeleton-based action recognition has attracted growing attention because it is robust to changes in background, illumination, and viewpoint, carries rich information, and has a low computational cost. Traditional skeleton-based methods focus on encoding the skeleton sequence but ignore the topological relationships among the joints. Graph convolutional neural networks can use an adjacency matrix to model the topological relationships in non-Euclidean data, and they have therefore become a research hotspot for skeleton-based human action recognition. Building on this advantage, this thesis studies skeleton-based action recognition and addresses existing shortcomings of graph convolutional neural networks. Two improvement schemes are proposed, and an action recognition system based on the human skeleton is designed on top of them. The main work is as follows:

(1) To address the problems that skeleton-based action recognition does not extract spatio-temporal features sufficiently and has difficulty capturing global dependencies among joints, a skeleton-based human action recognition scheme based on a spatio-temporal attention mechanism and an adaptive graph convolutional network is studied. First, a spatio-temporal attention module based on the non-local operation is constructed to help the model focus on the most discriminative frames and regions in the skeleton sequence. Second, an adaptive graph convolutional network is constructed using the feature-learning ability of a Gaussian embedding function and a lightweight convolutional neural network, taking into account the effect of human prior knowledge in different time periods. Finally, with the adaptive graph convolutional network as the basic framework, the spatio-temporal attention module is embedded to construct a two-stream fusion model that uses joint information, bone information, and their respective motion information.

(2) To address the problem of how to integrate spatio-temporal information in skeleton-based action recognition, two schemes for extracting spatial and temporal information with different emphases are studied: a shift graph convolutional network and a residual graph convolutional network. First, the principle of the shift convolutional network is introduced. Second, because a shift operation along the time dimension lets information interact across frames, the shift operation replaces the temporal convolution, combining the shift convolutional network with the spatio-temporal graph convolutional network to build a more lightweight shift graph convolutional network. Finally, the shift graph convolutional network is used as a branch of a residual network and embedded into the spatio-temporal graph convolutional network to construct the residual graph convolutional network, which aims to further strengthen the extraction of spatio-temporal information without losing the original spatio-temporal information.

(3) To apply the skeleton-based action recognition algorithm to real scenarios, a small action recognition system is designed and implemented. First, based on an analysis of the system requirements, two data acquisition schemes are adopted: camera capture and video upload. Second, OpenPose human pose estimation is used to convert video sequences into skeleton sequences, and the action recognition algorithm is encapsulated to predict actions. Third, a user management system is built to record user information and video information, further improving the user experience. Finally, the intermediate steps of action recognition and the final prediction results are visualized, and the visualization is deployed to a web page using the Flask framework to complete the system.
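
As an illustration of two ideas named in the abstract, the adjacency-matrix graph convolution and the temporal shift that replaces temporal convolution, the following is a minimal NumPy sketch. It is not the thesis's implementation: the function names, the 3-joint chain skeleton, the symmetric normalization, and the fixed one-frame shift over a quarter of the channels in each direction are all assumptions chosen for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, a common normalization (assumed here)."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so each joint keeps its own feature
    deg = a_hat.sum(axis=1)              # node degrees, always >= 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(x, adj_norm, weight):
    """Spatial graph convolution applied independently to every frame.

    x: (T, V, C_in) frames x joints x channels
    adj_norm: (V, V) normalized adjacency
    weight: (C_in, C_out) channel projection
    """
    return np.einsum("uv,tvc,cd->tud", adj_norm, x, weight)


def temporal_shift(x, fold_div=4):
    """Parameter-free temporal mixing (hypothetical simplification).

    The first C/fold_div channels are read from the previous frame, the next
    C/fold_div from the following frame, and the rest are left unchanged.
    Frames shifted in from outside the sequence are zero.
    """
    fold = x.shape[2] // fold_div
    out = np.zeros_like(x)
    out[1:, :, :fold] = x[:-1, :, :fold]                    # from previous frame
    out[:-1, :, fold:2 * fold] = x[1:, :, fold:2 * fold]    # from next frame
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]               # untouched channels
    return out


# Toy skeleton: 3 joints in a chain (0-1-2), 4 frames, 8 input channels.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 8))
w = rng.standard_normal((8, 5))

# Shift along time first, then aggregate over neighboring joints.
y = graph_conv(temporal_shift(x), normalize_adjacency(adj), w)
print(y.shape)  # (4, 3, 5)
```

The shift itself has no learnable parameters, which is why replacing a temporal convolution with it makes the network lighter; the channel projection in the graph convolution is what mixes the shifted features.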
Keywords/Search Tags: action recognition, human skeleton, graph convolution neural network, spatiotemporal information