
Research On Hand Posture Estimation Method Based On Deep Learning

Posted on: 2020-11-24 | Degree: Master | Type: Thesis
Country: China | Candidate: X H Wang
GTID: 2518306548994099 | Subject: Control Science and Engineering
Abstract/Summary:
With the development of computer technology, computer vision plays an increasingly important role in human-computer interaction (HCI), and accurate hand gesture estimation makes that interaction more effective. Traditional methods estimate hand gestures accurately with wearable devices such as data gloves, but these devices constrain the user, are hard to deploy at scale or in some projects, and are expensive. As image acquisition equipment becomes smaller and cheaper, hand gesture estimation based on computer vision has increasingly broad application prospects. In this work we use deep learning to estimate the hand gesture from images. Existing deep-learning methods for 3D hand pose estimation mostly use depth images. Compared with color cameras, depth image acquisition equipment is expensive and places certain constraints on users. Estimating the hand gesture from a single RGB image lowers the cost of use and makes gesture estimation more convenient, but an RGB image lacks depth information, which makes the 3D hand gesture difficult to estimate. To address these problems, we propose a 3D pose estimation framework based on convolutional neural networks, which consists of three parts.

First, we realize hand region tracking with an improved SegNet, which also crops the hand region from a single RGB image. We improve the traditional image segmentation network SegNet in three ways: we speed up the network by cutting off parts that are not needed, we optimize the connection pattern of the network with a feature pyramid network, and we propose a hierarchical attention mechanism that distributes weights across different levels. The position of the hand area is represented by a heat map, the hand area is segmented according to the confidence values in the heat map, and the hand image is cropped according to the segmented hand area. The segmented hand area can be used for hand tracking, and the cropped rectangular hand image can be used for the subsequent hand gesture estimation.

Then, building on the CPM (Convolutional Pose Machines) algorithm, we propose Pose Net, a 2D joint point estimation network based on a variable-scale Gaussian kernel, which estimates 2D joint points from a single RGB image. The network is deep, and it avoids the vanishing gradient problem through multi-layer supervision and continued refinement after the initial estimate. To handle hand self-occlusion, we enlarge the receptive field. Because directly regressing joint positions is not accurate enough, we recast the task as a detection task in which each joint point is represented by a score map with a Gaussian distribution, and the task of the neural network is to output the score map corresponding to each joint point. Different Gaussian maps affect the score map differently, so we propose a variable-scale Gaussian map and determine the Gaussian parameter value that gives the best result.

Finally, we propose the CFAM model to complete the 3D hand pose estimation. Previous methods for estimating the 3D hand gesture from a single RGB image use only the 2D joint point information to recover the 3D gesture. They ignore the hand texture features and the implicit spatial information in the RGB image, which leaves room for improvement in the accuracy of hand pose estimation. To solve this problem, CFAM combines the features of the 2D joint points and of the RGB image at the channel level and re-weights the feature maps that come from the RGB image and from the 2D score maps. CFAM cascades the RGB image features and the 2D score map features and allocates and uses each part of the features in a principled way. Introducing a channel attention mechanism improves the fusion of the different types of feature maps. Experiments show that the method is effective.
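
The score-map formulation used for 2D joint estimation can be illustrated with a minimal sketch. It is not the thesis's implementation: it only shows how a joint location can be encoded as a Gaussian score map and recovered as the map's maximum. The function names, the map size, and in particular the `sigma_schedule` rule (shrinking the Gaussian scale linearly across refinement stages) are assumptions made for illustration; the abstract does not state how the variable scale is chosen.

```python
import math


def gaussian_score_map(height, width, joint_xy, sigma):
    """Build a height x width score map that peaks (value 1.0) at joint_xy.

    joint_xy is (x, y) in pixel coordinates; sigma is the Gaussian scale.
    """
    cx, cy = joint_xy
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_joint(score_map):
    """Recover the (x, y) joint location as the argmax of the score map."""
    best_value, best_xy = -1.0, (0, 0)
    for y, row in enumerate(score_map):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def sigma_schedule(stage, num_stages, sigma_start=3.0, sigma_end=1.0):
    """Hypothetical variable-scale rule (an assumption, not from the thesis):
    interpolate sigma linearly from sigma_start to sigma_end across stages."""
    if num_stages == 1:
        return sigma_end
    t = stage / (num_stages - 1)
    return sigma_start + t * (sigma_end - sigma_start)


# Usage: one target map per refinement stage for a joint at (10, 20).
targets = [
    gaussian_score_map(32, 32, (10, 20), sigma_schedule(s, 3))
    for s in range(3)
]
recovered = [decode_joint(t) for t in targets]
```

A larger scale gives a broad target that is easier to hit early on, while a smaller scale gives a sharper peak and a more precise location, which is one plausible reason to vary the scale rather than fix it.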
Keywords/Search Tags:deep learning, hand gesture estimation, attention