Research On Human Skeleton Action Recognition Method Based On Graph Convolutional Network

Posted on: 2024-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Ji
GTID: 2568307061472004
Subject: Signal and Information Processing
Abstract/Summary:
Human action recognition has been a research hotspot in deep learning and visual analysis in recent years. Its task is to extract human action features from videos with neural network models and to classify the actions based on the extracted features. Action recognition techniques are widely used in video surveillance, motion analysis, virtual reality, and robotics. Because skeleton-based action recognition records only the position coordinates of human joints, it offers a small data volume, high-level semantics, freedom from irrelevant information such as background, and robust representations. Moreover, with the development of human pose estimation technology, human skeleton data has become easier to obtain, and skeleton-based action recognition has received widespread attention.

Traditional skeleton action recognition methods directly encode the raw skeleton data with a fixed topology, without examining whether correlations exist among the skeleton nodes. In contrast, graph convolution can capture the relationships between data in non-Euclidean space, and it has become an active research topic in skeleton-based action recognition. This thesis studies both classical and existing human skeleton action recognition methods, draws on the strong modeling ability of graph convolution in non-Euclidean space, and analyzes the defects of existing graph convolutional network methods. To address these defects, two improvement schemes are proposed.

(1) An adaptive depth differential graph convolutional network model for action recognition is proposed. Existing methods adaptively learn the topology of the human skeleton through attention or other mechanisms and then use the same topology for all channels. This forces the graph convolution to aggregate features with an identical topology in every channel, which limits the flexibility of feature extraction. To address this problem, an action recognition network framework with adaptive deep graph convolution is designed. The network takes skeleton sequences as input and explores motion features on different channels by constructing channel-level topologies, which improves the flexibility of feature extraction and supplements the graph convolution with more fine-grained information. In addition, to enrich the diversity of motion features, the depth graph convolution module that extracts motion features is improved by introducing a differential operation, which yields additional node-based motion features.

(2) An action recognition model with spatio-temporal attention and depth-enhanced graph convolution is proposed. A human skeleton action recognition scheme combining a spatio-temporal attention mechanism with a depth-enhanced model is investigated to address the difficulty of capturing global dependencies among joints and the limited ability to aggregate channel-level motion features. First, a spatio-temporal attention module based on global operations is constructed to emphasize the discriminative frames and nodes in the skeleton sequence, making it easier for the model to acquire key motion information. Second, a multi-scale spatio-temporal convolution is constructed by considering the influence of human body prior knowledge over different time periods and by using temporal convolutions with different receptive fields. Finally, the embedded depth enhancement module acquires global contextual information and local channel-level information and uses them as a complementary enhancement to the depth differential graph convolution. The method uses the depth differential graph convolution model as its basic framework and constructs a multi-stream fusion model from joint information, bone information, and their respective motion information.
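As a rough illustration of two ideas named in the abstract, the sketch below builds the four input streams (joint, bone, and their motions) and applies a graph aggregation that uses a separate topology per channel. It is not the thesis's implementation: the 5-joint skeleton in `BONES`, the function names, and the stream definitions (bone as child joint minus parent joint, motion as the difference between consecutive frames) are assumptions that follow common practice in skeleton-based recognition.

```python
import numpy as np

# Hypothetical 5-joint skeleton as (child, parent) pairs; joint 0 is the root.
BONES = [(1, 0), (2, 1), (3, 1), (4, 1)]


def temporal_difference(x):
    """Frame-to-frame difference along the time axis; the last frame is left at zero."""
    motion = np.zeros_like(x)
    motion[:-1] = x[1:] - x[:-1]
    return motion


def build_streams(joints):
    """Build four input streams from joints of shape (T, V, C).

    T = frames, V = joints, C = coordinate channels.
    """
    bones = np.zeros_like(joints)
    for child, parent in BONES:
        # A bone vector points from the parent joint to the child joint.
        bones[:, child] = joints[:, child] - joints[:, parent]
    return {
        "joint": joints,
        "bone": bones,
        "joint_motion": temporal_difference(joints),
        "bone_motion": temporal_difference(bones),
    }


def channelwise_graph_conv(features, topologies):
    """Aggregate neighbors with one topology per channel.

    features:   (C, T, V) feature map.
    topologies: (C, V, V), where topologies[c, v, u] weights joint u for joint v.
    A shared-topology graph convolution is the special case in which all C
    matrices are identical.
    """
    return np.einsum("cvu,ctu->ctv", topologies, features)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    joints = rng.normal(size=(8, 5, 3))        # 8 frames, 5 joints, xyz
    streams = build_streams(joints)
    features = rng.normal(size=(16, 8, 5))     # 16 channels
    topologies = rng.normal(size=(16, 5, 5))   # stand-in for learned topologies
    out = channelwise_graph_conv(features, topologies)
    print({name: s.shape for name, s in streams.items()}, out.shape)
```

In a multi-stream model of this kind, each stream is typically fed to its own network and the class scores are combined afterwards; the fusion weights are a design choice that the abstract does not specify.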
Keywords/Search Tags: Action Recognition, Deep Convolution, Spatio-temporal Features, Spatio-temporal Attention Mechanism, Depth Enhancement