Research On Hand Pose Estimation Based On Deep Learning

Posted on: 2022-10-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G L Wu
GTID: 1488306314965819
Subject: Mechanical and electrical engineering
Abstract/Summary:
In recent years, artificial intelligence and related fields have developed rapidly, and the way people interact with computers is becoming more natural and universal. Hands play an important part in people's daily activities and are essential to applications such as human-computer interaction, virtual reality and robotics, with uses in entertainment, consumer products, smart homes, intelligent driving, medicine, industrial design and space applications. As a result, hand pose estimation has attracted extensive attention. Its purpose is to recover the complete motion pose of the hand in three-dimensional space, so that a computer or other device can perceive the hand's spatial pose and act on human instructions. However, many problems in 3D hand pose estimation remain unsolved, such as the low resolution of the hand region, the high number of degrees of freedom of the hand, sensitivity to environmental conditions, fast motion, occlusion and the self-similarity of the hand, all of which hinder practical application. For these reasons, this dissertation focuses on hand pose estimation and uses deep learning to study depth data super-resolution, heterogeneous image registration and fusion, rapid detection of the hand region, salient detection and segmentation of the hand, hand tracking, and hand pose estimation. The main work is as follows.

First, the hand occupies a small proportion of the image, so the data resolution available for model analysis is low. This problem is more pronounced in low-resolution depth images. To address it, a depth map super-resolution method based on a depth feedback network is proposed. Through iterative up-sampling and down-sampling operations, the high-resolution representations are projected directly into the low-resolution space. The designed depth feedback module continuously simulates the process of image degradation and reconstruction and obtains abundant high-resolution intermediate features, which effectively improve the feature representation at the edges of the depth map and address the problem of edge blurring in depth super-resolution reconstruction.

Second, for hand detection and segmentation, YOLOv3 is used as the basic framework: a multi-head self-attention module replaces the convolutional layer, and transfer learning is used to obtain the bounding box of the hand. Then, the residual U-blocks of U2-Net are replaced by a bilateral attention module that introduces a complementary attention mechanism. Segmentation is carried out along two dimensions, foreground and background, to obtain precise segmentation data for the hand. Finally, to improve the efficiency of the network, SiamMask is used to quickly obtain the hand region in consecutive frames, further accelerating data extraction.

Third, for the core problem of hand pose estimation, this dissertation proposes a multi-resolution, multi-level feature fusion method for hand pose estimation based on point clouds. The overall network architecture consists of three basic building modules: the multi-resolution hand feature encoder, the hand pose decoder and the hand feature reconstruction decoder. The multi-resolution, multi-level hand feature encoder extracts point cloud features at different resolutions and fuses features from different levels. The resulting feature codes are sent to the 3D hand pose decoder, which decodes them into a 3D pose estimate of the hand. The hand feature reconstruction decoder takes the 3D pose estimate and the feature codes as input: the kernel point cloud is reconstructed from the 3D pose estimate, and the input point cloud is then reconstructed around the kernel point cloud and compared with the original input point cloud, which further improves the robustness and accuracy of hand pose estimation.

The research results provide a feasible implementation path for 3D hand pose estimation in human-computer interaction, virtual reality, robotics and many other applications. The proposed algorithms have been verified experimentally with good results and have practical application prospects.
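
The abstract states that the reconstructed point cloud is compared with the original input point cloud, but it does not name the comparison metric. As an illustration only, the sketch below assumes a symmetric Chamfer distance, a common choice for comparing point sets; the function names `squared_distance` and `chamfer_distance` are hypothetical and are not taken from the dissertation.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(source: Sequence[Point], target: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two point clouds.

    Assumption: this metric stands in for the unspecified comparison
    between the reconstructed and the original input point cloud.
    For every point in one cloud, take the squared distance to its
    nearest neighbour in the other cloud, average, and sum both directions.
    """
    if not source or not target:
        raise ValueError("both point clouds must be non-empty")
    source_to_target = sum(
        min(squared_distance(p, q) for q in target) for p in source
    ) / len(source)
    target_to_source = sum(
        min(squared_distance(q, p) for p in source) for q in target
    ) / len(target)
    return source_to_target + target_to_source


if __name__ == "__main__":
    original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reconstructed = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(chamfer_distance(original, reconstructed))
```

This brute-force version is quadratic in the number of points, which is fine for a small illustration; a training pipeline would normally use a vectorized or GPU implementation.
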
Keywords/Search Tags: Point cloud, Depth map super-resolution, Hand detection and segmentation, Hand pose estimation