3D Hand Pose Estimation Using Depth Images

Posted on: 2021-04-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W G Zhou
GTID: 1368330614450795
Subject: Mechanical and electrical engineering
Abstract/Summary:
3D hand pose estimation is an active interdisciplinary research field involving robotics, virtual reality, computer vision, and human-robot interaction. Vision-based hand pose estimation has broad application prospects because it is non-contact, low-cost, and portable. The task remains very challenging because of the hand's high degree of freedom, the similarity of its parts, scale variation, occlusion, truncation, and the motion blur and image noise introduced by sensors. With the rapid development of depth sensors, convolutional neural networks, and GPUs, it has become possible to estimate the 3D pose of human hands from depth images effectively and efficiently. Taking depth image-based 3D hand pose estimation as its research background and addressing the current problems and difficulties, this thesis carries out theoretical and experimental research on the image acquisition hardware platform, the pose estimation algorithms, and the system application. This work has important theoretical and practical significance.

To address motion blur and image noise in hand pose estimation, an image acquisition system for 3D hand pose estimation is designed and implemented to quickly acquire depth images that contain the hand. Compared with color cameras, depth sensors are inherently better suited to acquiring depth data and can provide depth information for the 3D hand pose estimation task. To cope with the imaging blur caused by rapid hand motion, a time-of-flight (ToF) depth sensor based on a system-on-chip (SoC) is designed and implemented. Sensor configuration, image acquisition, decoding, storage, calculation, and transmission are all completed on the FPGA chip using embedded parallel computing, which achieves a depth image acquisition speed of 131 frames per second. Compared with existing depth image acquisition platforms, the system is compact and offers low power consumption, low noise, and a high frame rate.

To address the high degree of freedom and the similarity of parts in hand pose estimation, an HMTNet network is proposed that makes better use of prior information about the morphological topology of the hand. The range of motion of a hand joint is related to its distance from the root, and the movements of the joints are strongly correlated: the distal joint on a finger depends on the proximal joint of the same finger. Based on a kinematic analysis of the fingers, a hand morphology topology network is designed to model this dependence. The five branches of the tree network correspond to the five fingers, and the network also models the kinematic relationships between the joints of each finger through the properties of convolution. In addition, the feature extraction module concatenates low-dimensional and high-dimensional features to extract richer initial features. Experiments show that this method achieves a lower hand pose estimation error and a higher real-time frame rate.

To address self-occlusion and incompleteness in hand pose estimation, an MVPointNet network is proposed that better captures the local neighborhood information of the point cloud. Richer point-pair features are obtained by augmenting the center point of each point pair with point coordinates, edge vectors, vector lengths, angles, and surrounding neighborhood information. Hand point clouds obtained from different viewpoints are fed to the point cloud feature extraction module and the multi-layer perceptron, respectively, to obtain more robust fused features and to improve the network's robustness to the pose changes caused by changes in viewpoint. Experiments show that this method achieves the best results on a 3D object classification dataset and on a point cloud-based 3D hand pose estimation dataset.

To address the object occlusion and scale problems caused by interaction between the hand and an object, an MSRAHandNet network is proposed that automatically assigns greater weight to the hand area and ignores the occluding object and the background. An iterative cropping algorithm locates the center of the hand area so that the hand is better segmented from the depth image. In the regression network, an attention mechanism lets the network automatically assign greater attention weight to the hand area, which improves the network's ability to regress joint coordinates from the hand region produced by the iterative cropping algorithm. The attention module combines cascaded residual connections with multi-scale techniques to further improve the performance of the pose regression network. Experiments show that this method achieves the best results on a dataset of hand pose estimation during interaction with objects.

Finally, on the basis of the image acquisition system and the 3D hand pose estimation algorithms described above, a real-time 3D hand pose estimation system is built, and an experiment in which a teleoperated manipulator grasps objects is completed, verifying the system in real scenes. The full experimental pipeline includes depth image acquisition by the SoC-based ToF depth camera, high-speed transmission of the depth images, real-time acquisition of the 3D coordinates of the hand joints by the 3D hand pose estimation algorithm, and object grasping by a dexterous hand based on a mapping algorithm. To handle objects with different shape features, five-finger, three-finger, and two-finger grasping experiments are carried out. The objects are grasped successfully in each case, which verifies the practicability of the 3D hand pose estimation system in real scenes.
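
The abstract does not give the details of the iterative cropping step, so the following is only a minimal sketch of one common way to locate a hand center in a depth image: start from the nearest valid pixel, then repeatedly recompute the center of mass of the pixels inside a window around the current estimate. The function names (`nearest_valid_pixel`, `refine_hand_center`, `crop_hand`), the window and depth-band parameters, and the assumption that the hand is the object closest to the camera are all illustrative choices, not taken from the thesis.

```python
from typing import List, Tuple

Depth = List[List[float]]  # row-major depth map in millimetres; 0 means "no reading"
Center = Tuple[float, float, float]  # (row, col, depth)


def nearest_valid_pixel(depth: Depth) -> Center:
    """Initial guess: the closest valid pixel (assumes the hand is nearest the camera)."""
    best = None
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            if d > 0 and (best is None or d < best[2]):
                best = (float(r), float(c), float(d))
    if best is None:
        raise ValueError("depth map contains no valid pixels")
    return best


def refine_hand_center(depth: Depth, half_window: int = 2, depth_band: float = 150.0,
                       max_iters: int = 10, tol: float = 1e-6) -> Center:
    """Iteratively move the center to the mean of the pixels kept by the current crop."""
    row, col, dist = nearest_valid_pixel(depth)
    height, width = len(depth), len(depth[0])
    for _ in range(max_iters):
        r_c, c_c = int(round(row)), int(round(col))
        rows = range(max(0, r_c - half_window), min(height, r_c + half_window + 1))
        cols = range(max(0, c_c - half_window), min(width, c_c + half_window + 1))
        # Keep valid pixels whose depth lies within the band around the current center.
        kept = [(r, c, depth[r][c]) for r in rows for c in cols
                if depth[r][c] > 0 and abs(depth[r][c] - dist) <= depth_band]
        if not kept:
            break
        n = len(kept)
        new = (sum(p[0] for p in kept) / n,
               sum(p[1] for p in kept) / n,
               sum(p[2] for p in kept) / n)
        shift = max(abs(new[0] - row), abs(new[1] - col), abs(new[2] - dist))
        row, col, dist = new
        if shift < tol:  # the center stopped moving, so the crop has converged
            break
    return row, col, dist


def crop_hand(depth: Depth, center: Center, half_window: int = 2,
              depth_band: float = 150.0) -> Depth:
    """Cut a square patch around the center and zero out pixels outside the depth band."""
    row, col, dist = center
    r_c, c_c = int(round(row)), int(round(col))
    patch = []
    for r in range(r_c - half_window, r_c + half_window + 1):
        out_row = []
        for c in range(c_c - half_window, c_c + half_window + 1):
            inside = 0 <= r < len(depth) and 0 <= c < len(depth[0])
            d = depth[r][c] if inside else 0.0
            out_row.append(d if d > 0 and abs(d - dist) <= depth_band else 0.0)
        patch.append(out_row)
    return patch
```

In this sketch the depth band is what rejects the background and any object farther away than the hand; an occluding object at a similar depth would not be removed this way, which is the case the attention mechanism described in the abstract is meant to handle.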
Keywords/Search Tags:robot, human-robot interaction, depth images, hand pose estimation, occlusion