3D Hand Reconstruction Using A Single RGB Image

Posted on: 2020-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Xia
Full Text: PDF
GTID: 2428330590961469
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of emerging 3D technologies such as virtual reality and augmented reality, the hand has become the main carrier of human-computer interaction in 3D scenes, and 3D reconstruction of the hand is a prerequisite for such interaction. 3D hand reconstruction refers to using acquired raw data of the hand to reconstruct its pose in a 3D scene. Most existing methods use RGB-D images, that is, images with depth information, as the raw data for 3D reconstruction. RGB-D images can only be captured by a depth camera, which makes such methods costly and limits their adoption. Depth cameras also place strict requirements on the working environment: data can only be collected against a simple background, which limits applicability. In addition, because the keypoints of the hand suffer from severe self-occlusion, current 3D hand reconstruction methods are not accurate.

In this paper, a cascaded model with two sub-tasks, based on a deep neural network, is proposed for 3D hand reconstruction. The network takes a single RGB image as input, predicts the 3D coordinates of the hand keypoints in the image, and handles self-occlusion well. Unlike traditional methods, which use only the 2D position information of the keypoints as the preliminary sub-task, the network in this paper consists of four parts. The first two parts extract features from the input image and obtain the 2D position information of the keypoints and the hand depth information. The third part uses an attention mechanism to fuse the two extracted features, and the fused features serve as the input to the fourth part. The fourth part produces the 3D coordinates of the hand keypoints, which achieves the goal of 3D hand reconstruction.

In addition, because few 3D hand datasets are currently publicly available and the labels required by the proposed method are relatively complex, a new dataset, named HPE, is constructed to train the proposed network. The dataset is synthesized with computer software and contains more than 20,000 samples. Each sample contains an RGB image of a right hand, 22 hand keypoint heatmaps, a hand depth map, and the 3D coordinates of the 21 hand keypoints.

A complete ablation experiment shows that the three improvements proposed in the network structure effectively improve the accuracy of 3D keypoint detection. In addition, in a comparison with existing advanced methods on two datasets, the proposed method outperforms the current best-performing methods in 2D coordinate prediction, with improvements of 1.8% and 1% respectively. In 3D coordinate prediction, the final accuracy of the method over the 20-50 mm threshold range reaches 99.9%, which is 0.5% higher than the current best method. These results indicate that the proposed 3D hand reconstruction method using a single RGB image is superior in performance.
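The abstract does not name the metric behind the 3D result over the 20-50 mm threshold range. One common reading in the 3D hand pose literature is the percentage of correct keypoints (PCK), summarized as the normalized area under the PCK curve across that range. The sketch below illustrates that reading only; the function names `pck_3d` and `pck_auc`, the array layout, and the choice of 31 evenly spaced thresholds are assumptions for illustration and are not taken from the thesis.

```python
import numpy as np


def pck_3d(pred, gt, threshold_mm):
    """Fraction of keypoints whose Euclidean error is at most threshold_mm.

    pred, gt: arrays of shape (num_samples, num_keypoints, 3), in millimetres.
    (Hypothetical layout, assumed for this sketch.)
    """
    errors = np.linalg.norm(pred - gt, axis=-1)  # (num_samples, num_keypoints)
    return float(np.mean(errors <= threshold_mm))


def pck_auc(pred, gt, t_min=20.0, t_max=50.0, steps=31):
    """Area under the PCK curve over [t_min, t_max], normalized to [0, 1]."""
    thresholds = np.linspace(t_min, t_max, steps)
    pck = np.array([pck_3d(pred, gt, t) for t in thresholds])
    # Trapezoidal rule, written out to avoid depending on a NumPy version.
    area = np.sum((pck[1:] + pck[:-1]) / 2.0 * np.diff(thresholds))
    return float(area / (t_max - t_min))


if __name__ == "__main__":
    # Toy data: one sample with 21 keypoints, one of them off by 30 mm.
    gt = np.zeros((1, 21, 3))
    pred = gt.copy()
    pred[0, 0, 0] = 30.0
    print("PCK @ 20 mm:", pck_3d(pred, gt, 20.0))
    print("PCK @ 50 mm:", pck_3d(pred, gt, 50.0))
    print("AUC 20-50 mm:", pck_auc(pred, gt))
```

Under this reading, a single keypoint with a large error lowers the curve only at the thresholds below that error, so the area summarizes accuracy across the whole range rather than at one cutoff.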
Keywords/Search Tags: Deep Learning, 3D Hand Reconstruction, 3D Hand Pose Estimation, Attention Mechanism