Research On Human Skeleton Action Recognition Based On Graph Convolutional Network

Posted on: 2024-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Zhen
Full Text: PDF
GTID: 2568307076997599
Subject: Mechanical (Computer Technology) (Professional Degree)
Abstract/Summary:
Human action recognition is one of the main research tasks in computer vision, with rich application scenarios and high research value in security surveillance, medical detection and human-computer interaction. Methods that take surveillance video as input require a large amount of computational resources to process RGB images and are less robust when the background is complex. In recent years, thanks to the rapid development of depth imaging technology, 3D human skeleton data can be acquired with Kinect depth sensors. Such data describe human movements through the 3D spatial trajectories of the major joints, are computationally more efficient to process, and are robust to background noise. Skeleton-based human action recognition is therefore receiving increasingly widespread attention from researchers. Because skeleton data has an inherent graph structure, graph convolutional networks can be applied readily to the skeleton action recognition task, and good results have been achieved. This thesis studies skeleton action recognition based on the graph convolutional network model, with the aim of improving the model's ability to extract spatial-domain and temporal-domain features from skeleton sequence data. The main work is as follows:

(1) An action recognition model combining adaptive graph convolution with multi-scale temporal modelling is proposed. The adaptive graph convolution module extracts spatial-domain features and mainly consists of a data-driven graph and a data correlation graph. The data-driven graph can generate new connections that are not present in the original physical connection graph, and its edge weights are adjusted during training so that they are tailored to the different information embedded in different network layers. The data correlation graph is adjusted for each action sample by calculating the connection strength between any two joints to form the edge weights, thus participating in the adjustment of the topology graph. The multi-scale temporal modelling module extracts time-domain features. It extends the receptive field in the time dimension using dilated (atrous) convolution, effectively aggregates the temporal relationships between adjacent and non-adjacent time steps, and obtains multi-scale information in the time dimension while suppressing local time-domain information that is not relevant to action recognition.

(2) An action recognition model combining decoupled attention graph convolution and temporal modelling is proposed. The model consists of a decoupled attention graph convolution module, a channel attention module and a multi-scale temporal modelling module. The decoupled attention graph convolution module and the channel attention module extract spatial-domain features. The decoupled attention graph convolution module groups channels, with different groups in different layers having trainable attention adjacency matrices, which increases the expressiveness of spatial aggregation. The channel attention module adds an attention mechanism to the channel features and learns different weights for different channels during training. This enables the model to focus on and extract the channel features that are more relevant to the action to be recognised, removing the influence of redundant features. The multi-scale temporal modelling module extracts time-domain features, as described above.
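As a rough illustration of the two ideas in model (1), the sketch below builds an adaptive adjacency matrix and applies dilated convolution along the time axis. It is not the thesis implementation, and the abstract gives no formulas. The additive combination of the three graphs, the dot-product similarity with a row-wise softmax for the per-sample graph, the zero padding, and all function names (`adaptive_adjacency`, `dilated_temporal_conv`, `multi_scale_temporal`) are assumptions made for this example.

```python
import math


def softmax_rows(matrix):
    """Row-wise softmax of a 2-D list (numerically stabilised)."""
    result = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def adaptive_adjacency(physical, learned, features):
    """Combine three V x V graphs into one adjacency matrix.

    physical: fixed skeleton connections.
    learned:  data-driven graph (would be a trainable parameter).
    features: V x C joint features of ONE sample; their pairwise
              dot products give the per-sample correlation graph.
    Assumption: the three graphs are simply added together.
    """
    n = len(features)
    similarity = [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]
    sample_graph = softmax_rows(similarity)
    return [
        [physical[i][j] + learned[i][j] + sample_graph[i][j] for j in range(n)]
        for i in range(n)
    ]


def dilated_temporal_conv(sequence, kernel, dilation):
    """1-D dilated convolution over time, zero padded to keep the length.

    Assumes an odd kernel size so the window is centred on each step.
    """
    half = (len(kernel) // 2) * dilation
    output = []
    for t in range(len(sequence)):
        acc = 0.0
        for i, weight in enumerate(kernel):
            idx = t - half + i * dilation  # taps are `dilation` steps apart
            if 0 <= idx < len(sequence):
                acc += weight * sequence[idx]
        output.append(acc)
    return output


def multi_scale_temporal(sequence, kernel, dilations):
    """Run one branch per dilation rate and return all branch outputs."""
    return [dilated_temporal_conv(sequence, kernel, d) for d in dilations]
```

For a three-joint chain with `physical = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]`, the combined matrix gives joints 0 and 2 a positive weight even though they are not physically connected, which is the kind of new connection the abstract describes. Calling `dilated_temporal_conv([1, 2, 3, 4, 5], [1, 1, 1], 2)` returns `[4.0, 6.0, 9.0, 6.0, 8.0]`: each output mixes time steps two frames apart, so a three-tap kernel spans five frames.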
Keywords/Search Tags: Action recognition, Adaptive graph convolution, Multi-scale temporal modelling, Decoupled attention graph convolution, Channel attention