3D Hand Pose Estimation From Single RGB Images

Posted on: 2021-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: J L Wang
GTID: 2518306107960409
Subject: Control Science and Engineering
Abstract/Summary:
3D hand pose estimation is a popular research direction in the field of human-computer interaction. It studies how to estimate the 3D pose of a hand from an image, which is of great significance for augmented reality and virtual reality technologies. Depending on the form of the input image, the field includes estimation tasks based on monocular, multi-view, and depth images. This thesis studies 3D hand pose estimation from monocular RGB images. The results are represented by the 3D coordinates of the hand keypoints.

First, we studied an estimation method based on a two-stage deep network. The first stage estimates heatmaps of the 2D hand keypoints from an image; a new encoder-decoder network was designed to produce pixel-wise estimates. The second stage estimates the 3D coordinates from the keypoint heatmaps. Building on an existing method, the hand pose was decomposed into a local pose and a global pose, and a new two-stream network was designed to estimate them separately. The two stages were trained separately and did not affect each other. The network structure is simple and each part has a clear purpose. Experiments show that the proposed two-stage method can effectively estimate 3D hand pose.

Secondly, we studied an estimation method based on an end-to-end deep network. A novel hand pose representation was proposed and a new end-to-end deep network was designed. The 2D and 3D coordinates of the hand keypoints are output and supervised directly, so the image information can be fully used in the estimation. The input and output of the method are clearly defined, no additional data processing is needed, and the 2D and 3D estimation tasks are interconnected, mutually constraining, and jointly optimized. Experiments show that the proposed end-to-end method achieves state-of-the-art performance.

Finally, we studied an estimation method based on a graph convolutional deep network. Graph convolutional networks are neural networks designed for graph-structured data that does not lie on a regular grid, and the skeleton of the hand is a typical non-grid structure. In this thesis, a graph was constructed from the hand skeleton, and a graph convolutional network suited to 2D feature maps was designed to refine the estimation of the hand keypoints. The method makes full use of the skeletal structure of the hand by combining it with a deep network. Experiments show that the proposed graph-convolutional-network method achieves the best performance.
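
The idea of building a graph from the hand skeleton and applying a graph convolution to it can be illustrated with a minimal sketch. It is not the network designed in the thesis: it assumes the commonly used 21-keypoint layout (wrist plus four joints per finger), a generic symmetrically normalized graph convolution layer, and per-keypoint feature vectors rather than 2D feature maps. All names (`hand_skeleton_edges`, `normalized_adjacency`, `graph_conv`) and the random weights are hypothetical and chosen only for illustration.

```python
import numpy as np

# Assumed layout: keypoint 0 is the wrist; each finger owns four consecutive
# indices starting at 1, 5, 9, 13, 17. This indexing is an assumption,
# not something specified in the abstract.
NUM_KEYPOINTS = 21
FINGER_BASES = [1, 5, 9, 13, 17]


def hand_skeleton_edges():
    """Return the bones of the hand as (parent, child) keypoint index pairs."""
    edges = []
    for base in FINGER_BASES:
        edges.append((0, base))            # wrist to the first joint of the finger
        for j in range(base, base + 3):    # chain the remaining joints of the finger
            edges.append((j, j + 1))
    return edges


def normalized_adjacency(edges, num_nodes):
    """Build D^-1/2 (A + I) D^-1/2 for an undirected graph."""
    a = np.eye(num_nodes)                  # self-loops keep each node's own feature
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(features, adj_norm, weight):
    """One graph convolution layer: aggregate neighbors, project, apply ReLU.

    features: (num_nodes, c_in), weight: (c_in, c_out)
    """
    return np.maximum(adj_norm @ features @ weight, 0.0)


rng = np.random.default_rng(0)
adj = normalized_adjacency(hand_skeleton_edges(), NUM_KEYPOINTS)
coords_2d = rng.normal(size=(NUM_KEYPOINTS, 2))   # stand-in for estimated 2D keypoints
weight = rng.normal(size=(2, 16))                 # untrained, illustrative weights
out = graph_conv(coords_2d, adj, weight)
print(out.shape)  # (21, 16)
```

In this sketch each keypoint's output feature mixes information only from itself and the joints it is directly connected to by a bone, which is the sense in which the skeleton structure constrains the estimate. Stacking several such layers would let information propagate along a whole finger.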
Keywords/Search Tags: 3D Hand Pose Estimation, Deep Learning, Pose Estimation, Graph Convolution