Study On The Learning Method Of Spatiotemporal Manifold Feature Of Human Action In 3D Motion Space

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Hu
Full Text: PDF
GTID: 2518306017472844
Subject: Computer technology
Abstract/Summary:
Skeleton data has been widely used in action recognition because it adapts stably to dynamic environments and complex backgrounds. In existing methods, the joint and bone information in the skeleton has proved very helpful for motion recognition. However, how to combine these two types of data to make the best use of the relationship between joints and bones remains an open problem. Action recognition (video classification) is an important direction in the field of video understanding, and the in-depth analysis and processing of acquired 3D visual data of the human body is a cutting-edge research topic in machine learning and pattern recognition. Because human action is non-rigid motion, the optimization and analysis of the related 3D data is strongly nonlinear, and it is difficult to handle directly with traditional methods based on Euclidean space. As Riemannian manifolds have advantages in describing 3D motion, we study the underlying theory and, on that basis, propose a Riemannian manifold trajectory graph convolutional network that makes full use of 3D skeleton data.

The key to skeleton-based action recognition lies in two aspects: on the one hand, how to design strongly discriminative skeleton features; on the other hand, how to use temporal-domain correlation to model the dynamic changes in an action. Given the characteristics of skeleton movement, the challenge is how to extract discriminative spatiotemporal features and effectively model the spatiotemporal evolution of different actions. For this reason, we propose a spatiotemporal attention model that integrates spatiotemporal attention submodules into an end-to-end deep learning architecture.

The main contributions of this thesis are as follows:

(1) We designed a spatiotemporal trajectory graph convolution model based on Riemannian manifolds. To learn the spatiotemporal features of the pairwise changes during skeleton movement, the original three-dimensional skeleton coordinates are preprocessed into action sequence curves. Treating each action curve as a graph node in the graph convolution, we construct the action trajectory curve using the Riemannian metric on the Riemannian manifold. The trajectory graph convolution is then used to make predictions on the established neighborhood subgraphs. Based on the prediction results, the nodes in the constructed subgraphs are assigned pseudo labels, and the nodes are thereby classified into different categories. Our model was evaluated on three current action recognition datasets, and the results show that our algorithm achieves the best results.

(2) We designed an end-to-end spatiotemporal attention model and proposed a spatial-domain attention submodule to automatically mine joints, since certain types of actions are usually associated with, and characterized by, only a subset of kinematic joints. First, the SE(3) structure is used to express the interaction process between two persons. In addition, for the interaction of the skeleton parts in each frame of the motion, we propose a spatiotemporal-domain attention submodule, which explicitly learns and assigns content-related attention to the output of each frame. Different levels of attention are paid to the skeleton parts involved in the interaction to improve recognition performance. Then, a Riemannian similarity measure is applied to the filtered skeleton sequences to construct a neighborhood graph. Finally, the graph convolution subnetwork based on similarity learning updates the edges between all nodes and then separates the nodes according to their similarity: the distance between nodes of the same action class is reduced and the distance between nodes of different action classes is enlarged, forming clusters of different actions and yielding a graph with global similarity. The results show that our model achieves good results on a variety of public interaction action datasets.
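The step of building a neighborhood graph from a Riemannian similarity measure can be illustrated with a small sketch. The abstract does not state which metric or graph rule the thesis uses, so everything below is an assumption made for illustration: each action is represented as a sequence of rotation matrices, the distance is the geodesic (angular) distance on SO(3) averaged over frames, and neighbors are chosen by a symmetric k-nearest-neighbor rule. The names `so3_geodesic`, `sequence_distance`, `knn_graph` and `rot_z` are hypothetical helpers, not functions from the thesis.

```python
import numpy as np


def so3_geodesic(R1, R2):
    """Geodesic (angular) distance between two rotation matrices on SO(3)."""
    cos_theta = (np.trace(R1.T @ R2) - 1.0) / 2.0
    # Clip to guard against round-off pushing the value outside [-1, 1].
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def sequence_distance(seq_a, seq_b):
    """Mean per-frame geodesic distance between two equal-length
    sequences of rotation matrices, each of shape (T, 3, 3).
    Assumption: frames are already aligned in time."""
    return float(np.mean([so3_geodesic(a, b) for a, b in zip(seq_a, seq_b)]))


def knn_graph(sequences, k):
    """Return the pairwise distance matrix and a symmetric boolean
    adjacency matrix linking each node to its k nearest neighbors."""
    n = len(sequences)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = sequence_distance(sequences[i], sequences[j])
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        order = np.argsort(dist[i])
        neighbours = [j for j in order if j != i][:k]
        adj[i, neighbours] = True
    # Symmetrize: keep an edge if either endpoint selected the other.
    return dist, adj | adj.T


def rot_z(theta):
    """Rotation by theta radians about the z-axis (toy data helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Toy data: four constant "actions", two near 0 rad and two near 1.5 rad.
sequences = [np.stack([rot_z(a)] * 4) for a in (0.0, 0.1, 1.5, 1.6)]
dist, adj = knn_graph(sequences, k=1)
print(adj.astype(int))
```

On this toy input the two sequences with similar rotation angles end up linked to each other and not to the other pair, which is the kind of grouping a similarity-based neighborhood graph is meant to expose before any graph convolution is applied.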
Keywords/Search Tags: Attention Mechanism, Trajectory Graph Convolution, Manifold, End-to-end Model