
The Research Of RGB 3D Gesture Tracking Algorithm Based On GCN

Posted on: 2023-07-07 | Degree: Master | Type: Thesis
Country: China | Candidate: Q Li | Full Text: PDF
GTID: 2568306845459604 | Subject: Electronic Information (Computer Technology) (Professional Degree)
Abstract/Summary:
As an attractive field in computer vision, dynamic gesture tracking is the foundation of emerging 3D gesture-interactive applications such as virtual reality and augmented reality. More and more people hope to build a natural channel of expression between the virtual world and the real world through gestures. Screens controlled by gestures, once confined to science-fiction films, are gradually becoming a reality. The technology for segmenting, detecting, estimating, and recognizing user gestures to realize operations such as clicking, dragging, and rotating, as well as gesture interaction with devices that exploits spatial awareness, is becoming increasingly mature. Gesture tracking and gesture recognition are prerequisites for realizing gesture interaction. In recent years, graph convolutional networks have been gradually applied to hand pose estimation and gesture recognition, owing to their ability to learn effective representations of graph-structured data. On this basis, this thesis consists of three parts: a static hand pose estimation model based on RGB images, namely a U-shaped adaptive graph convolutional network (U-AGCN); a skeleton-based dynamic gesture recognition model, namely a two-stream spatial-temporal fusion graph convolutional network (2s-STFGCN); and a dynamic hand skeleton dataset, VR-DHG, collected at different granularities.

Firstly, as the application scenarios of human-computer interaction grow more complex, tracking systems demand ever higher precision in gesture estimation. To address the depth ambiguity of RGB images and the heavy self-occlusion in 3D hand pose estimation, this thesis combines adaptive graph convolution with a U-Net structure to predict the positions of the 3D hand joints. By mining the spatial information in the graph structure, the problems of depth ambiguity and finger similarity are alleviated. To track hands robustly in real time, this thesis combines channel, spatial-temporal, and motion attention into an Action module. Embedded into the keypoint detection network, the module extracts and retains feature information to the greatest extent, reducing visible jitter between real-time frames.

Secondly, compared with full-body actions, different hand actions differ only slightly, and the range of motion of the hand joints is smaller than that of other body joints. It is therefore necessary to further explore the local spatial-temporal information and the global dependencies that arise during action execution. To overcome these drawbacks, second-order bone information is introduced to enrich the details of the joints, and kinematic constraints are applied. Local spatial-temporal information is fused into a single graph to capture detailed spatial-temporal relationships, and gated dilated convolution is combined to pay more attention to correlations across long sequences.

Thirdly, gesture interaction data are characterized by high within-cluster variance. To better simulate the actions found in virtual interactive applications and to further study the characteristics of fine-grained gesture data, this thesis collects a dynamic gesture skeleton dataset, VR-DHG. The dataset clearly distinguishes coarse-grained from fine-grained actions, providing a basis for studying subtle changes in gestures.

Finally, the U-AGCN model extracts richer feature information on the public datasets STB and FPHA, improving estimation accuracy for heavily self-occluded gestures and alleviating the jitter problem in real-time applications. The 2s-STFGCN model achieves good recognition results on the public dataset DHG-14/28 and also shows a clear advantage in fine-grained gesture recognition on the self-built dataset VR-DHG.
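The abstract does not include code, but the core idea behind adaptive graph convolution, a fixed hand-skeleton adjacency augmented by a learnable offset so the layer can discover extra joint relations (e.g. between similar-looking fingers), can be sketched roughly as follows. This is a minimal illustrative sketch in PyTorch, not the thesis's actual U-AGCN architecture; all layer names and sizes here are assumptions.

```python
import torch
import torch.nn as nn

class AdaptiveGraphConv(nn.Module):
    """Graph convolution over hand joints with a learnable adjacency offset.

    Illustrative sketch only: the fixed matrix A encodes the hand-skeleton
    graph, while B is a learnable offset that lets the layer discover
    additional joint relations beyond the physical skeleton.
    """
    def __init__(self, in_ch, out_ch, adjacency):
        super().__init__()
        self.register_buffer("A", adjacency)                # fixed skeleton graph (J x J)
        self.B = nn.Parameter(torch.zeros_like(adjacency))  # learned adjacency offset
        self.proj = nn.Linear(in_ch, out_ch)                # per-joint feature projection

    def forward(self, x):                # x: (batch, J, in_ch)
        adj = self.A + self.B            # adaptive adjacency
        return torch.relu(adj @ self.proj(x))  # aggregate over neighbouring joints

# Toy usage: 21 hand joints, 2-D input features, 16 output features.
J = 21
A = torch.eye(J)  # stand-in adjacency; a real model would use the skeleton graph
layer = AdaptiveGraphConv(2, 16, A)
out = layer(torch.randn(4, J, 2))
print(tuple(out.shape))  # (4, 21, 16)
```

In a U-shaped arrangement such layers would be stacked with graph pooling and unpooling between them, mirroring the encoder-decoder structure of U-Net.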
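Likewise, the gated dilated convolution mentioned for long-sequence correlation is, in its generic form, a gated 1-D dilated convolution over the temporal axis: dilation widens the receptive field without adding parameters, and a sigmoid gate controls how much of each filtered frame passes through. The sketch below shows this general technique; the exact configuration inside 2s-STFGCN is an assumption.

```python
import torch
import torch.nn as nn

class GatedDilatedTemporalConv(nn.Module):
    """Gated 1-D dilated convolution over a per-joint feature time series.

    Sketch of the general technique: the dilated filter branch captures
    long-range temporal context, and the sigmoid gate branch decides how
    much of each filtered frame is kept.
    """
    def __init__(self, channels, kernel_size=3, dilation=2):
        super().__init__()
        pad = (kernel_size - 1) * dilation // 2  # preserve sequence length
        self.filt = nn.Conv1d(channels, channels, kernel_size,
                              padding=pad, dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, kernel_size,
                              padding=pad, dilation=dilation)

    def forward(self, x):  # x: (batch, channels, T)
        return torch.tanh(self.filt(x)) * torch.sigmoid(self.gate(x))

# Toy usage: 64 feature channels over a 32-frame gesture clip.
block = GatedDilatedTemporalConv(64)
y = block(torch.randn(2, 64, 32))
print(tuple(y.shape))  # (2, 64, 32)
```

Stacking such blocks with increasing dilation rates grows the temporal receptive field exponentially, which is what makes them attractive for modelling long gesture sequences.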
Keywords/Search Tags: Hand posture estimation, Gesture recognition, Adaptive graph convolution, Spatial-temporal fusion, Virtual Reality