Research On Gesture Interaction Algorithm Based On Keypoint Detection And Instance Segmentation

Posted on: 2021-06-15
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhao
GTID: 2518306515472774
Subject: Computer Science and Technology
Abstract/Summary:
Gestures provide a natural input mechanism for human-computer interaction and are an important means of interaction. In recent years, advances in hardware and artificial intelligence have produced a variety of data-glove prototypes and computer-vision methods for estimating gestures accurately and quickly. Since the advent of depth cameras in 2010 in particular, gesture estimation has attracted great attention in the field of computer vision. On the application side, gesture estimation is expected to enable touch-free human-computer interaction, and it has great potential in virtual reality, augmented reality, gesture recognition, sign language systems, interactive games, motion recognition, and other fields. 3D hand pose estimation is therefore an important task in computer vision. Despite its rich applications, it remains challenging, for example because of the hand's many degrees of freedom and occlusion.

This thesis is divided into three main parts: gesture estimation, gesture segmentation, and gesture interaction. The gesture estimation network, DB-InterNet, estimates 3D gestures from 2D images to obtain the coordinates of the hand joints. The gesture segmentation part studies instance segmentation algorithms, transfers an instance segmentation algorithm to gesture segmentation, trains a gesture segmentation model, and outputs a finer pixel-level gesture segmentation image. Finally, in an augmented reality environment, the results of gesture estimation and gesture segmentation are used to achieve simple virtual-real occlusion, and the occlusion relationship is presented on the user's screen.

The main contributions to hand pose estimation are as follows. (1) 3D gesture estimation algorithm: to address the low accuracy of keypoint coordinates in the InterNet gesture estimation network, this thesis proposes the DB-InterNet network, which builds on a confidence coordinate function and a dynamic activation function to improve the accuracy of gesture estimation. (2) Confidence coordinate function: because the soft-argmax function does not locate the coordinate of the maximum accurately enough, an inhibition factor is added to the confidence coordinate function to raise the probability assigned to the maximum value. Different inhibition factors are selected for different keypoints so that the function suits two-handed gesture estimation, which makes the coordinate of the maximum, and therefore the keypoint prediction, more accurate. (3) Dynamic activation function: because a static activation function does not learn the input features comprehensively, a dynamic activation function is used to learn features dynamically at different positions in the network. This strengthens the learning of joints so that the network captures more comprehensive gesture context. Different layers adjust the piecewise activation function dynamically according to their input, which improves the representation ability of the model. The function also helps the network learn occluded joints.

The main contributions to gesture segmentation are as follows. (1) Gesture segmentation algorithm: to address the low accuracy of the instance segmentation network Mask R-CNN on masks with complex shapes, this thesis proposes an instance segmentation algorithm based on path enhancement and a fused feature pyramid network to improve mask quality. (2) Feature enhancement: to address the adhesion between segmentation masks caused by slender fingers, path enhancement and feature fusion are applied to the feature pyramid network, which detects targets at different scales. High-level semantic features are fused with low-level positional information and passed back into the network, so that all subsequent layers carry rich semantics, which achieves feature enhancement. (3) Normalization: to further improve the accuracy of the model, group normalization (GN) is introduced as a simple substitute for batch normalization (BN) in the gesture segmentation algorithm. It keeps all features in a similar range, makes it easier for gradient descent to reach a minimum, and speeds up the convergence of the learning algorithm.

For gesture interaction, this thesis uses gestures to perform simple interactions with virtual objects. It judges the positional relationship between the hand and the virtual object from the estimated joint coordinates, fills the occlusion layers in occlusion order, renders the occlusion relationship, merges the contents of all layers into a single image, and shows the result on a desktop display. As a result, users can see gestures and virtual content superimposed on the real world.
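As a rough illustration of the confidence coordinate idea in contribution (2) of the pose estimation part, the sketch below computes a one-dimensional soft-argmax in Python and shows how a scaling factor sharpens it. The abstract does not give the thesis's formula, so the multiplicative form softmax(beta * score), the name `beta`, the function `soft_argmax`, and the toy heatmap are all assumptions made for illustration, not the DB-InterNet implementation.

```python
import math


def soft_argmax(scores, beta=1.0):
    """Return the expected index under softmax(beta * scores).

    `beta` is a hypothetical stand-in for the inhibition factor: values
    above 1 suppress non-maximal responses, so the estimate moves toward
    the index of the true maximum.
    """
    peak = max(scores)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [math.exp(beta * (s - peak)) for s in scores]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


# Toy 1D heatmap row: the true peak is at index 3, with a weaker bump at index 0.
heatmap = [2.0, 0.5, 1.0, 4.0, 1.0]

plain = soft_argmax(heatmap, beta=1.0)  # pulled toward index 0 by the weaker bump
sharp = soft_argmax(heatmap, beta=5.0)  # lands much closer to index 3
print(plain, sharp)
```

With the larger factor, the weaker bump contributes almost nothing to the weighted average, so the estimated coordinate sits nearly on the peak. Choosing a separate factor per keypoint, as the abstract describes, would amount to passing a different `beta` for each joint's heatmap.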
Keywords/Search Tags: Gesture pose estimation, Instance segmentation, Gesture segmentation, InterNet, Augmented Reality