Research On Spatiotemporal Feature Enhancement For Skeleton-based Human Action Recognition

Posted on:2024-04-05

Degree:Master

Type:Thesis

Country:China

Candidate:R X Qing

Full Text:PDF

GTID:2568307127454164

Subject:Computer Science and Technology

Abstract/Summary:

Human action recognition is a research focus of computer vision and is the basis for human motion prediction and action localization. With the development of intelligent machines, medicine, and video surveillance, human action recognition has important practical value in human-computer interaction, security monitoring, and video understanding. Compared with RGB video data, skeleton data is more robust to complex background scenes and changes in motion angle and is more lightweight, so skeleton-based human action recognition has developed rapidly in recent years. With the development of deep learning, and given the good performance of graph convolutional networks on non-Euclidean data, more and more researchers have applied graph convolutional networks to skeleton-based human action recognition. In this task, extracting discriminative spatiotemporal features is the key to recognition. However, existing skeleton action recognition methods that use graph convolutional networks as the baseline still suffer from insufficient extraction of spatiotemporal dependency features, insufficient learning of the latent relationships between features, and insufficiently discriminative features for the limb parts that perform the action. To address these problems, this thesis studies the enhancement of spatiotemporal dependency features, the learning of latent feature relationships, and hierarchical learning on skeleton data. The main contents and results are as follows:

(1) A skeleton action recognition method based on a Multi-Granularity Spatio-Temporal Encoder (MG-STE) is proposed. First, in the spatial domain, a Multi-Granularity Spatial Encoder (MG-SE) module partitions the feature vectors along the joint dimension into groups of different joint granularities; each group retains all the temporal information of the action for its joints. Second, in the time domain, a Multi-Granularity Temporal Encoder (MG-TE) module partitions the features along the time dimension into granular features covering continuous segments of different lengths; each segment retains the spatial information of all joints. Then, a Two-stream Multi-Granularity Spatio-Temporal Encoder Graph Convolutional Network (2s-MG-STEGCN) is built on the Multi-Granularity Spatio-Temporal Encoder, and the final prediction is obtained by a weighted fusion of the individual prediction scores of the joint stream and the bone stream. Finally, experiments on the NTU-RGBD 60 and Kinetics-Skeleton 400 datasets verify the effectiveness of the proposed method.

(2) A human skeleton action recognition method based on a Feature Difference and Feature Correlation Learning mechanism (FDCL-GCN) is proposed. First, the Temporal Feature Difference and Correlation Learning (TFDCL) module learns the feature correlation between related parts in adjacent frames and captures feature differences from the changes in joint motion over the entire long-term timeline. Second, the Channel Feature Difference and Correlation Learning (CFDCL) module uses independent convolution kernels to interact with different channels, producing more complex feature maps that highlight the key joints with high influence on the whole movement. Then, considering that all joints are involved in maintaining the progression of the motion and body balance, the Temporal Channel Context Topology (TCCT) module dynamically learns the context topology to enhance global features. Finally, experiments on the NTU-RGBD 60 and Kinetics-Skeleton 400 datasets verify the competitiveness of the proposed method.

(3) A method based on Hierarchical Learning Strategies and Short-term Motion Enhancement (HLS-SME) is proposed. First, Hierarchical Learning Strategies (HLS) are proposed to perform hierarchical learning on skeleton data. Because training independent streams on multiple data modalities is costly and feature information cannot be shared between modalities, the multimodal data is first processed in a unified way so that the modalities share global coordinates. Each modality is then fed into its own network pipeline for training, so that knowledge can be shared between the modalities. Second, a Short-term Motion Expansion (SME) module is proposed to enhance short-term motion features. Finally, the method is evaluated on three large public datasets, NTU-RGBD 60, NTU-RGBD 120, and Kinetics-Skeleton 400, and the results show its competitiveness and effectiveness.

In summary, this thesis takes the graph convolutional network as its basic framework, studies human action recognition based on skeleton data, and proposes three skeleton action recognition methods. Experiments on multiple public skeleton datasets demonstrate the good performance of the proposed methods and indicate that this research has both theoretical and practical value.
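
The weighted fusion of joint and bone prediction scores mentioned in contribution (1) can be illustrated with a minimal sketch. The abstract does not give the fusion weights, the score normalization, or how the bone data is built, so everything below is an assumption made for illustration: bones are taken as the vector from a parent joint to its child (a common convention in two-stream skeleton models), scores are softmax-normalized, and a single hypothetical `joint_weight` controls the balance. The function names are also hypothetical and are not taken from the thesis.

```python
import math


def bone_vectors(joints, parents):
    """Derive bone features as child-joint minus parent-joint coordinates.

    Assumption: joints is a list of (x, y, z) tuples and parents[i] is the
    index of joint i's parent; a root that is its own parent gets a zero bone.
    """
    return [
        tuple(c - p for c, p in zip(joints[i], joints[parents[i]]))
        for i in range(len(joints))
    ]


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(joint_logits, bone_logits, joint_weight=0.5):
    """Weighted sum of per-class scores from the two streams.

    Assumption: joint_weight lies in [0, 1] and the bone stream receives
    the remaining weight, so the fused scores still sum to 1.
    """
    if len(joint_logits) != len(bone_logits):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= joint_weight <= 1.0:
        raise ValueError("joint_weight must be in [0, 1]")
    joint_scores = softmax(joint_logits)
    bone_scores = softmax(bone_logits)
    bone_weight = 1.0 - joint_weight
    return [
        joint_weight * j + bone_weight * b
        for j, b in zip(joint_scores, bone_scores)
    ]


def predict(joint_logits, bone_logits, joint_weight=0.5):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(joint_logits, bone_logits, joint_weight)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Toy three-class logits; the two streams disagree on the top class.
    joint_logits = [2.0, 1.0, 0.1]
    bone_logits = [0.5, 2.5, 0.3]
    print(predict(joint_logits, bone_logits, joint_weight=0.5))
    print(predict(joint_logits, bone_logits, joint_weight=0.9))
```

In this sketch the fused decision depends on the chosen weight when the streams disagree, which is why such weights are usually tuned on validation data rather than fixed in advance.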

Keywords/Search Tags:

Skeleton action recognition, graph convolutional networks, feature enhancement, spatio-temporal feature learning, hierarchical learning

Related items

1	Action Recognition Method Based On Multi-frequency Spatio-temporal Feature Learning
2	Research On 3D Skeleton Action Recognition Based On Spatio-Temporal Feature Learning
3	Human Skeleton Action Recognition Based On Deep Learning
4	Research On Human Skeleton Action Recognition Method Based On Graph Convolutional Network
5	Video Action Recognition Based On 2D Convolution Network Under Spatio-Temporal Feature Enhancement Mechanism
6	Research On Human Skeleton Action Recognition Based On Graph Convolutional Networks
7	Research On Spatio-Temporal Feature Based Human Action Recognition
8	Research On Action Recognition Based On Deep Network Learning Of Spatio-temporal Features
9	Research On Human Action Recognition Based On Skeleton Features
10	Research On Skeleton-based Action Recognition Method Guided By Semantic Features