Skeletal Action Recognition Based On Attention Mechanism Preferences And Local Information Enhancement

Posted on:2023-06-08

Degree:Master

Type:Thesis

Country:China

Candidate:M Q Zhu

GTID:2568307103485254

Subject:Computer Science and Technology

Abstract/Summary:

With the development of computer vision and artificial intelligence across many fields, intelligent video understanding is becoming indispensable. Within it, action recognition, which analyzes human motion information in space and time, has become a research hotspot: human motion data is acquired to analyze motion state and motion intention and to classify the action accurately. Compared with the traditional approach of using videos and images as information carriers, action recognition based on human skeleton information has attracted more attention because skeleton data is robust to, and generalizes well across, changes in light intensity and complex backgrounds. The most commonly used technique is the graph convolutional network, which models the action information of the human skeleton in the spatial and temporal dimensions as spatiotemporal feature maps and extracts co-occurrence features from them. It mainly learns long-range connections through a series of 3D convolutions, but these connections are limited by the size of the convolution kernel. To solve this problem, this paper introduces the self-attention mechanism of the Transformer to capture long-range dependencies and obtain global information, and designs a convolutional self-attention module to address the Transformer's strong dependence on data and its computational cost. The main work and contributions are as follows:

(1) A skeleton action recognition model that combines graph convolution and Transformer is proposed. By introducing the Transformer's self-attention mechanism to establish long-range dependencies and combining it with a graph convolutional network for action recognition, the model can both extract local information through the graph convolutional network and capture rich long-range dependencies through the Transformer. In addition, the Transformer's self-attention mechanism is calculated at the pixel level, so it has a huge computational cost. The model therefore adopts a network stage division strategy that divides the entire network into two stages: the first stage uses pure convolution to extract shallow spatial features, and the second stage uses the proposed Conv T block to capture high-level semantic information, which reduces the computational complexity.

(2) A convolutional self-attention module is designed to replace the linear embedding in the original Transformer, and a multi-scale framework is used to model the multi-order data of the skeleton simultaneously. The original Transformer architecture loses position and order information on its own, so it vectorizes all input sequences, adds a fixed position encoding, and then maps the data with a linear embedding. This paper instead designs a convolutional self-attention module to replace the original linear embedding. The module uses the characteristics of graph convolution to enhance local spatial information and to obtain position information implicitly, so the position encoding can be removed, which improves the performance of the model and makes it more lightweight. In addition, this paper uses a multi-scale framework to combine the first-order joint information and second-order bone information of the skeleton data, and fuses multi-scale features to obtain better feature extraction results.

In summary, this paper studies action recognition based on the human skeleton and finds an effective method that enables the model to obtain both local and global information while reducing model complexity in a lightweight way. Finally, experiments are carried out on two classic action recognition datasets, NTU-RGB+D and Kinetics-Skeleton. The experimental results show that the proposed method is effective and improves the performance of the model.
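As a rough illustration of two ideas named in the abstract, the sketch below derives second-order bone vectors from first-order joint coordinates and applies scaled dot-product self-attention across joints. It is a minimal sketch under stated assumptions, not the thesis's implementation: the 5-joint skeleton, the `PARENTS` list, and the function names are hypothetical, and the attention uses the joint coordinates directly as queries, keys, and values instead of learned (or convolutional) projections.

```python
import math

# Hypothetical toy skeleton: PARENTS[j] is the parent joint of joint j
# (the root, joint 0, is its own parent). Not taken from the thesis.
PARENTS = [0, 0, 1, 2, 1]


def bone_vectors(joints, parents):
    """Second-order (bone) features: each joint minus its parent joint."""
    return [
        [c - p for c, p in zip(joints[j], joints[parents[j]])]
        for j in range(len(joints))
    ]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    Assumption: no learned projections, so every joint attends to every
    other joint using raw coordinates only.
    """
    d = len(x[0])
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        for q in x
    ]
    weights = [softmax(row) for row in scores]
    out = [
        [sum(w * v[i] for w, v in zip(w_row, x)) for i in range(d)]
        for w_row in weights
    ]
    return out, weights


# One frame of made-up 3D joint coordinates (first-order information).
joints = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 3.0, 0.0],
    [1.0, 1.0, 0.0],
]

bones = bone_vectors(joints, PARENTS)        # second-order information
attended, weights = self_attention(joints)   # global joint-to-joint mixing
```

Each row of `weights` spans all joints, which is what lets attention relate joints that are far apart in the skeleton graph; a convolution kernel covers only a fixed neighborhood.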

Keywords/Search Tags:

action recognition, human skeleton, Transformer, graph convolutional network, self-attention mechanism

Related items

1	Research On Human Skeleton Action Recognition Based On Graph Convolutional Networks
2	Research On Human Skeleton Action Recognition Based On Graph Convolutional Network
3	Research On Skeleton Action Recognition Algorithm Based On Spatiotemporal Attention Mechanism
4	Research On Skeleton-based Action Recognition Via Graph Neural Network
5	Research On Human Action Recognition Method Based On 3D Skeleton Information Feature
6	Research On Human Skeleton Action Recognition Method Based On Graph Attention Mechanism
7	Research On Human Motion Recognition Based On Skeleton 3D Information
8	Multi-stream Slow Fast Graph Convolutional Networks For Skeleton-based Action Recognition
9	Research On Human Action Recognition Based On Deep Learning
10	Research On Human Action Recognition Method Based On Skeleton Features