With the rapid development of computer vision and artificial intelligence, a growing body of research focuses on human action recognition. The advantages of using 3D skeletal action data as input to action recognition models have given rise to a series of methods based on graph convolutional networks (GCNs), which have achieved remarkable results in skeleton-based action recognition. The key to GCN-based human action recognition is to construct the topology of the human skeleton sequence graph and to use the action features extracted by the graph convolutional layers to update the network weights. However, most GCN models do not consider the diversity of motion trajectories. To address this problem, this study uses samples with different trajectories to construct a positive sample set for each action anchor under a contrastive learning framework, and uses this set to construct a loss function that guides the update of the network weights, so that the model can better recognize multiple trajectories of the same action. In addition, a search algorithm is proposed that searches the skeleton graph topology space to determine the optimal skeleton graph topology for each layer of the GCN, which further improves the recognition accuracy of the model. The work of this paper is as follows:

1. In the supervised learning setting, to address the problem that most GCN-based models do not consider the diversity of motion trajectories, this paper proposes a Supervised Spatio-Temporal Contrastive Learning framework for skeletal action recognition (SSTCL). SSTCL constructs target joint nodes (anchors) based on contrastive learning and uses label information to obtain multiple positive samples paired with each anchor in every minibatch. Because the model builds multiple positive sample pairs, this paper further proposes a Multi-Positives Noise Contrastive Estimation loss for Supervised learning (MultiPNCE-Sup). The loss function uses the multiple motion trajectories of skeleton samples to guide the learning of different views, measures the quality of the SSTCL model, and encourages the model to learn more representative features that generalize across multiple motion trajectories.

2. Current GCN-based models usually share a fixed skeleton sequence graph topology across different graph convolutional layers, which cannot highlight the main features of actions. To solve this problem, this paper proposes an optimal Skeleton Subgraph Topology (optSST) algorithm. The algorithm constructs a skeleton graph topology space by randomly selecting several joints. For each skeleton graph topology in the space, optSST calculates the joint correlation strength matrix and feeds the corresponding hidden-layer features to the GCN layer for feature extraction. The extracted features are classified, and the graph topology with the highest current classification accuracy is retained.

3. Experiments are performed on three large action datasets, NTU-RGB+D 60, NTU-RGB+D 120, and Kinetics Skeleton, and the experimental results are analyzed. Ablation experiments verify that the proposed supervised contrastive learning framework and the optSST algorithm based on skeleton graph topology search both improve the accuracy of action recognition. This paper also compares the experimental results with other state-of-the-art methods, showing that the proposed method achieves the best recognition accuracy on some of the datasets.
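
The idea of pairing each anchor with several same-label positives in a minibatch can be illustrated with a minimal sketch. The abstract does not give the formula for MultiPNCE-Sup, so the code below is not that loss; it is a generic supervised InfoNCE-style loss with multiple positives, and the function names `cosine` and `multi_positive_nce`, the cosine similarity measure, and the `temperature` parameter are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multi_positive_nce(embeddings, labels, temperature=0.1):
    """Generic supervised multi-positive InfoNCE loss over one minibatch.

    Hypothetical illustration, not the paper's MultiPNCE-Sup loss.
    For each anchor, every other sample with the same label is a
    positive; all remaining samples act as negatives.
    """
    n = len(embeddings)
    total, anchors_used = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-label partner is skipped
        # Temperature-scaled similarity of the anchor to every other sample.
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average the negative log-probability over the anchor's positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors_used += 1
    return total / anchors_used if anchors_used else 0.0


# Toy minibatch: two embeddings per action class (values are made up).
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = multi_positive_nce(batch, [0, 0, 1, 1])
loss_mismatched = multi_positive_nce(batch, [0, 1, 0, 1])
```

In this toy batch, labels that agree with the geometry of the embeddings yield a lower loss than labels that contradict it, which is the property a loss of this family relies on to pull different trajectories of the same action together.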