
Human Skeleton-based Action Recognition Based On Deep Learning

Posted on: 2021-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
Full Text: PDF
GTID: 2518306452963119
Subject: Information and Communication Engineering
Abstract/Summary:
Action recognition has great practical and social value in video understanding, intelligent monitoring, and human-computer interaction. Human skeleton-based action recognition is an important direction within human action recognition. Compared with methods based on RGB video or optical flow, skeleton data is robust to dynamic backgrounds, illumination changes, and similar factors, so action recognition based on the human skeleton has great research value. Deep learning has produced many breakthroughs in computer vision and makes skeleton-based action recognition more effective. Most current skeleton-based action recognition models use convolutional neural networks, recurrent neural networks, or graph convolutional networks. The spatial-temporal graph convolutional network (ST-GCN) is an advanced model for skeleton-based action recognition. To address the problem that ST-GCN can only learn local information within a fixed neighborhood, this paper uses a non-local attention mechanism to aggregate global information across the human skeleton. A series of ablation experiments in the temporal and spatial dimensions varies the number and position of the attention modules, and the analysis of these experiments yields AM-STGCN, a graph convolutional network based on the attention model. Experimental results show that the method effectively improves the accuracy of skeleton-based action recognition. To address the problem that the graph structure of the convolution kernel in ST-GCN is fixed and therefore does not suit all samples, this paper first adds non-physical connections to strengthen the links between skeleton joints, giving a model named NPL-STGCN. It then uses higher-order information to design a method that automatically learns from each sample and builds a different graph structure for each one, thereby constructing a unique graph convolution kernel per sample; the resulting model is named HOA-STGCN. Experimental results show that the method significantly improves action recognition performance and has good robustness and generalization ability. Finally, to address the limited accuracy of a single model, this paper exploits the complementarity that the models show across different action categories and fuses the models proposed above into a model-fusion spatial-temporal graph convolutional network (MF-STGCN). Experimental results on two large-scale datasets, Kinetics and NTU RGB+D, demonstrate that the model achieves significant improvements over previous representative methods and has strong expressive power and generalization ability.
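The abstract does not give the form of the non-local attention module, so the following is only a minimal sketch of the general idea, assuming the common embedded-Gaussian (softmax dot-product) formulation applied across the joints of a single frame. The function name `non_local_joint_attention`, the weight matrices, and the sizes `V`, `C`, and `D` are hypothetical choices for illustration and are not taken from the thesis.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_joint_attention(x, w_theta, w_phi, w_g, w_out):
    """Illustrative non-local attention over skeleton joints (assumed form).

    x: (V, C) array, one C-dimensional feature vector per joint.
    Returns the updated (V, C) features and the (V, V) attention map.
    """
    theta = x @ w_theta                      # (V, D) query embedding
    phi = x @ w_phi                          # (V, D) key embedding
    g = x @ w_g                              # (V, D) value embedding
    # Each joint attends to every other joint, not only graph neighbours.
    attn = softmax(theta @ phi.T, axis=-1)   # (V, V), rows sum to 1
    y = attn @ g                             # (V, D) globally aggregated features
    return x + y @ w_out, attn               # residual connection keeps input path


# Hypothetical sizes: 25 joints, 16 input channels, 8 embedding channels.
rng = np.random.default_rng(0)
V, C, D = 25, 16, 8
x = rng.standard_normal((V, C))
w_theta, w_phi, w_g = (0.1 * rng.standard_normal((C, D)) for _ in range(3))
w_out = 0.1 * rng.standard_normal((D, C))

out, attn = non_local_joint_attention(x, w_theta, w_phi, w_g, w_out)
```

In contrast to a graph convolution restricted to a fixed neighborhood, the `(V, V)` attention map here lets any joint draw on any other joint, which is the global aggregation the abstract describes. Where such a module sits in the network and how many are used is what the ablation experiments mentioned above investigate.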
Keywords/Search Tags: deep learning, action recognition, graph convolution network, attention mechanism, model fusion