Research On Gesture Recognition Algorithm Based On Deep Learning

Posted on: 2022-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2518306545990429
Subject: Control Science and Engineering
Abstract/Summary:
In the 21st century, we are in the rising period of a new wave of artificial intelligence, and the convenience that its growing popularity brings is self-evident. Human-computer interaction technology plays an important role in the field of artificial intelligence. Faces, gestures and limbs are the most commonly used sources of information in vision-based human-computer interaction. Gesture interaction is favored by many researchers because of its convenient operation and flexible expression. Because gesture recognition must be both real-time and accurate in practical applications, recognizing gestures efficiently and accurately is the key to human-computer interaction. This thesis studies static and dynamic gestures. The main work is as follows:

1) Given the wide application of object detection algorithms in static gesture recognition and the need for convolutional neural networks to recognize gestures efficiently and accurately, an improved static gesture recognition algorithm based on the CenterNet model is proposed. Non-local modules of different numbers and sizes are added to address the problem that ordinary convolution operations attend only to local information, improving the network's ability to capture global information. The residual network in the model is improved with depthwise separable convolution, which effectively reduces the amount of computation while optimizing the network structure. To address the inaccuracy of small-batch training when GPU memory is limited, batch normalization is replaced with group normalization. Experiments show that the modified CenterNet algorithm effectively improves the network's generalization ability on the gesture model. The gesture recognition rate is 4.9% higher than that of the original network, and the detection time is less than 0.005 s.

2) To address inaccurate key frame extraction in several common sampling methods, a video key frame extraction method based on optical flow is proposed. The key frames of each video are extracted by comparing optical flow magnitudes, yielding a standard dynamic gesture video sequence. In addition, the data set is expanded to avoid overfitting during training. Experiments show that the optical flow method effectively extracts key frames that carry motion information from the video and plays a significant role in improving gesture recognition accuracy.

3) Using multiple data fusion methods to improve recognition accuracy has become common in human action recognition. However, several popular fusion methods merely stack data from multiple modalities, and the resulting improvement in gesture recognition rate is not obvious. Therefore, this thesis proposes a multimodal joint training method based on the C3D network model that combines RGB, depth and optical flow data. First, a model is trained on RGB data using the C3D convolutional neural network; the depth and optical flow data are then each fine-tuned from the trained RGB model to obtain a model for each modality. Experiments show that this training method not only improves gesture recognition accuracy but also accelerates convergence when training on depth and optical flow data.
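The optical-flow frame extraction described in item 2) can be illustrated with a minimal sketch. The abstract states only that frames are chosen by comparing optical flow magnitudes, so the details here are assumptions made for illustration: each frame is scored by its mean flow magnitude, the `k` highest-scoring frames are kept, and they are returned in temporal order. The names `mean_flow_magnitude` and `select_key_frames` are hypothetical and do not come from the thesis.

```python
import math
from typing import List, Sequence, Tuple

# One (dx, dy) displacement vector per pixel, flattened into a sequence.
Flow = Sequence[Tuple[float, float]]


def mean_flow_magnitude(flow: Flow) -> float:
    """Average Euclidean length of the flow vectors in one frame."""
    if not flow:
        return 0.0
    return sum(math.hypot(dx, dy) for dx, dy in flow) / len(flow)


def select_key_frames(flows: Sequence[Flow], k: int) -> List[int]:
    """Return the indices of the k frames with the largest mean flow
    magnitude, sorted back into temporal order.

    Assumption: top-k by mean magnitude; the thesis may use another rule.
    """
    if k <= 0:
        return []
    scores = [mean_flow_magnitude(f) for f in flows]
    # Rank by score, highest first; ties go to the earlier frame.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


# Toy input: 5 "frames" of 2 pixels each; frames 1 and 3 move the most.
flows = [
    [(0.0, 0.0), (0.1, 0.0)],
    [(3.0, 4.0), (0.0, 0.0)],
    [(0.5, 0.0), (0.0, 0.5)],
    [(0.0, 2.0), (2.0, 0.0)],
    [(0.0, 0.0), (0.0, 0.0)],
]
key_frames = select_key_frames(flows, k=2)
```

In practice the per-frame flow fields would come from a dense optical flow estimator applied to consecutive frames, for example OpenCV's `cv2.calcOpticalFlowFarneback`; which estimator the thesis uses is not stated in the abstract. Fixing `k` gives every video the same number of frames, which matches the abstract's goal of a standard-length sequence for the downstream network.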
Keywords/Search Tags: Gesture recognition, C3D convolutional neural network, non-local module, key frame extraction, multimodal data fusion