Research On Human Skeleton Action Recognition Based On Graph Convolutional Networks

Posted on: 2024-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Qi
Full Text: PDF
GTID: 2568307058482174
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
Human Action Recognition (HAR) has gained increasing interest in recent years due to advances in deep learning and computer vision and to its applications in human-computer interaction, eldercare and healthcare assistance, and video surveillance. With the availability of depth sensors such as Microsoft Kinect and the improvement of human pose estimation algorithms, skeleton-based HAR has attracted more attention. New methods and models for the skeleton-based HAR task have continued to emerge, driving the steady development of this field. Because Graph Convolutional Networks (GCN) can be applied to irregular, non-Euclidean structured data, more and more methods model human skeleton data as spatial-temporal graphs and apply GCN to extract motion information. However, most existing GCN-based methods ignore the diversity of motion information across the channels of the input feature. How to enhance the ability to capture long-term global correlations in the spatial and temporal dimensions is also a fundamental challenge. In addition, most current methods are too complex to run smoothly on low-performance devices or mobile terminals, so designing a light-weight model with high recognition accuracy is an urgent problem. This thesis studies the above problems, and its main research work and innovations are as follows:
(1) To address the importance of global motion information in the spatial-temporal dimensions, this thesis proposes the Multi-Stream Global-Local Motion Fusion Network (GLMFN). In the proposed method, the improved GCN can learn the varied motion characteristics across different channels of human skeleton data, while the self-attention mechanism is applied to extract global motion information in the spatial and temporal dimensions. GLMFN uses human skeleton data as input features. Specifically, we design a grouping graph convolution module to strengthen the ability to aggregate local spatial motion information. In addition, to learn richer semantic features, we propose two modules based on the self-attention operator: a spatial self-attention module and a temporal self-attention module. The former extracts long-range spatial motion relationships, while the latter captures long-term temporal motion relationships. Moreover, we present a multi-stream fusion strategy with a series of treatments for body joints to achieve better recognition results. To validate the efficacy and efficiency of the proposed model, we perform extensive experiments on two public skeleton-based HAR datasets, and our method achieves state-of-the-art performance on both.
(2) To address the problem of high model complexity, this thesis proposes a light-weight Graph Convolutional Network with Long Time Memory (GCN-LTM). The network consists of two streams: a GCN-stream and a Recurrent Neural Network stream (RNN-stream). Specifically, human skeleton data is fed into both streams simultaneously. The GCN-stream captures spatial motion relationships, while the RNN-stream focuses on extracting long-term motion features. To better promote feature learning between the two streams, a contrastive learning strategy is introduced. Similarly, the multi-stream fusion strategy combining original and high-order skeleton data is also applied to this method. Experiments on challenging datasets demonstrate that the proposed method improves recognition accuracy while reducing the time complexity of the network.
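As a rough illustration of the ideas named in the abstract, the following sketch shows a generic spatial graph convolution over skeleton joints, a bone-vector stream as one possible form of "high-order" skeleton data, and weighted score fusion across streams. It is not the GLMFN or GCN-LTM implementation. The abstract does not give layer definitions, so the symmetric adjacency normalization, the joint-minus-parent bone definition, the late-fusion rule, the 5-joint toy skeleton, and all function names are assumptions made for illustration.

```python
import math

# Hypothetical 5-joint toy skeleton given as (child, parent) pairs.
# Real datasets use more joints; this layout is an illustration only.
BONES = [(1, 0), (2, 1), (3, 1), (4, 0)]
NUM_JOINTS = 5


def normalized_adjacency(num_joints, bones):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for child, parent in bones:
        a[child][parent] = 1.0
        a[parent][child] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def graph_conv(features, adj, weight):
    """One spatial graph convolution for a single frame: ReLU(A_hat X W)."""
    out = matmul(matmul(adj, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


def bone_stream(joints, bones):
    """Assumed second-order input: each joint minus its parent (root is zero)."""
    out = [[0.0] * len(joints[0]) for _ in joints]
    for child, parent in bones:
        out[child] = [c - p for c, p in zip(joints[child], joints[parent])]
    return out


def fuse_scores(stream_scores, weights):
    """Assumed late fusion: weighted sum of per-class scores from each stream."""
    num_classes = len(stream_scores[0])
    return [sum(w * s[c] for s, w in zip(stream_scores, weights))
            for c in range(num_classes)]


# Usage on one frame of made-up 3D joint coordinates.
frame = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, 1.0, 0.0],
         [1.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
adj = normalized_adjacency(NUM_JOINTS, BONES)
weight = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # arbitrary 3 -> 2 channel map
hidden = graph_conv(frame, adj, weight)
bone_feats = bone_stream(frame, BONES)
# Hypothetical class scores from a joint stream and a bone stream.
fused = fuse_scores([[0.2, 0.8], [0.6, 0.4]], [0.5, 0.5])
```

In a full model, the weight matrix would be learned, the convolution would be applied to every frame and interleaved with temporal modules, and each input stream would be processed by its own network before the scores are fused.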
Keywords/Search Tags:Human action recognition, Human skeleton data, Graph Convolutional Network, Self-attention mechanism, Recurrent Neural Network