Design And Implementation Of Dynamic Gesture Recognition System Based On Recurrent Neural Network Model

Posted on: 2022-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: H E Sun
Full Text: PDF
GTID: 2518306572959769
Subject: Computer technology
Abstract/Summary:
Gestures are a form of human body language that carries rich information and is a convenient, natural way for people to communicate. Gesture recognition is therefore increasingly applied in human-computer interaction. It is divided into static and dynamic gesture recognition according to the type of gesture. Because dynamic gestures reflect how a gesture changes continuously over time and can express more complex information, dynamic gesture recognition has broader application potential.

This thesis studies dynamic gesture recognition based on computer vision and deep learning. Taking into account recognition speed, recognition accuracy, and the graphics card specifications of the application platform, it proposes two dynamic gesture recognition models: Light-GestureNet and GestureNet. Light-GestureNet extracts global spatiotemporal features on top of the local spatiotemporal features of the image sequence that represents a dynamic gesture. GestureNet also starts from the local spatiotemporal features of the image sequence, then extracts deep features while concatenating and fusing global spatiotemporal features.

(1) A lightweight dynamic gesture recognition model, Light-GestureNet, based on Skip-Res3D and ConvLSTM is proposed. Light-GestureNet is suitable for application platforms with more than 6.5 GB of video memory, so it is widely applicable. Skip-Res3D draws on the cross-layer skip connections of Highway Networks, ResNet, and DenseNet and adds skip connections between all adjacent residual blocks in Res3D to fuse channel information. Light-GestureNet uses Skip-Res3D to extract the local spatiotemporal features of the image sequence, which also simplifies the feature set and speeds up the ConvLSTM computation. The feature sequence extracted by Skip-Res3D is then fed to ConvLSTM to extract global spatiotemporal features. As a result, Light-GestureNet speeds up model fitting while extracting the spatiotemporal features of the image sequence, reduces the loss of feature information caused by deepening the network, and integrates more channel information.

(2) GestureNet, a dynamic gesture recognition model based on Skip-Res3D and DepthNet, is proposed. GestureNet is suitable for application platforms with more than 11 GB of video memory and achieves a high recognition rate on dynamic gestures. DepthNet comprises depthwise separable convolutional neural network branches and a ConvLSTM branch. The depthwise separable convolutional branches extract deep features while reducing computation and the risk of overfitting; the ConvLSTM branch extracts global spatiotemporal features. GestureNet uses Skip-Res3D to extract the local spatiotemporal features of the image sequence, then feeds the resulting feature sequence into DepthNet, which extracts deep feature information while concatenating and fusing the global spatiotemporal features. GestureNet can therefore extract the spatiotemporal features of dynamic gestures effectively while reducing the number of parameters.

To verify the feasibility and effectiveness of the proposed models, training and testing experiments were carried out on the Jester dataset. Light-GestureNet achieved a recognition accuracy of 92.55% with an average recognition time of 7.57 milliseconds per gesture, occupying 6.5 GB of video memory, with an input of 16 images of 56×56 pixels. GestureNet achieved a recognition accuracy of 95.64% with an average recognition time of 14.27 milliseconds per gesture, occupying 10.9 GB of video memory, with an input of 16 images of 112×112 pixels. Confusion matrix analysis, dimensionality-reduction visualization, and saliency analysis further demonstrate the effectiveness of Light-GestureNet and GestureNet.

Finally, a client-server dynamic gesture recognition system that can recognize 7 dynamic gestures was built on these two models, with PowerPoint control as the application scenario for verifying the models in practice. The GestureNet-based system achieves a recognition accuracy of 97.57% with an average system delay of 452 milliseconds per gesture; the Light-GestureNet-based system achieves a recognition accuracy of 94.56% with an average system delay of 223 milliseconds per gesture. There are two reasons why the Light-GestureNet-based system has the lower delay: 1. Light-GestureNet recognizes dynamic gestures faster than GestureNet. 2. Because GestureNet has a large number of parameters, the images in its input sequence are larger to prevent overfitting, so data transmission from the client to the server is slower.

The experimental results show that Light-GestureNet is a lightweight dynamic gesture recognition model with fast recognition and low video memory usage, while GestureNet offers higher accuracy at the cost of slower recognition and higher video memory usage.
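
The second reason for the delay gap comes down to input size, which can be checked with simple arithmetic. The sketch below compares the raw size of one input clip for each model, using the frame counts and resolutions reported in the abstract. The 3-channel RGB layout, the 1-byte pixel values, and the absence of compression are assumptions made for illustration; the abstract does not state how the client encodes the images, and the name `clip_bytes` is hypothetical.

```python
# Frame count and resolutions come from the abstract.
# ASSUMPTIONS (not stated in the abstract): RGB frames, 1 byte per channel
# value, and no compression before the client sends the clip to the server.
FRAMES = 16          # images per gesture clip
CHANNELS = 3         # assumed RGB
BYTES_PER_VALUE = 1  # assumed 8-bit pixel values


def clip_bytes(side: int, frames: int = FRAMES) -> int:
    """Raw size in bytes of one clip of square frames with the given side length."""
    return frames * side * side * CHANNELS * BYTES_PER_VALUE


light_bytes = clip_bytes(56)   # Light-GestureNet input: 16 images of 56x56
full_bytes = clip_bytes(112)   # GestureNet input: 16 images of 112x112
ratio = full_bytes / light_bytes

print(light_bytes, full_bytes, ratio)  # 150528 602112 4.0
```

Under these assumptions the GestureNet clip is four times larger than the Light-GestureNet clip. The factor of four follows from the frame dimensions alone, so it holds for any channel count or pixel width as long as both clips are encoded the same way and sent uncompressed.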
Keywords/Search Tags: Gesture Recognition, Convolutional Neural Network, Convolutional LSTM, Deep Learning