
Research On Dynamic Gesture Recognition Algorithm Based On 3D Convolution

Posted on: 2022-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: Q J Huang
Full Text: PDF
GTID: 2518306488950849
Subject: Circuits and Systems
Abstract/Summary:
Human-computer interaction (HCI) has become a popular research topic in recent years. Because it offers a natural and comfortable way to operate devices, it has gradually been applied in many everyday scenarios. As computer software and hardware have developed, technologies that demand high computing power have steadily advanced. In deep learning, sufficient computing power and storage have enabled strong results on many vision tasks, such as image classification, object detection, and semantic segmentation. Fast and efficient algorithms continue to be proposed, and some have been deployed in practice, for example in face detection, mask detection, and body temperature detection.

Gestures are an important means of communication between people besides spoken language, and gesture recognition has become an important part of human-computer interaction. It has broad application prospects in areas such as assistance for deaf and mute users and virtual reality. Gesture recognition is divided into static gesture recognition and dynamic gesture recognition. Static gesture recognition identifies a gesture held in a fixed pose, while dynamic gesture recognition identifies a continuous sequence of gestures. Compared with static gesture recognition, dynamic gesture recognition involves more uncertainty in time and space and is therefore more difficult.

Vision-based methods for dynamic gesture recognition fall roughly into two categories: traditional methods and deep learning methods. Traditional methods mainly consist of four steps: segmenting the gesture, tracking the continuous gesture, extracting gesture features, and classifying the extracted features. Deep learning methods mainly involve designing a feature extraction model and training it on dynamic gesture data to obtain converged model weights. They offer a simpler pipeline and higher accuracy. This thesis therefore adopts a deep learning approach to dynamic gesture recognition and makes the following contributions.

This thesis proposes a 3D convolutional network model that combines an attention mechanism with a residual network. The residual network mitigates the degradation problem caused by excessive network depth, so the model's complexity and fitting ability can be increased by adding depth. The attention mechanism lets the network focus on the features that carry more weight for recognizing gestures. The model achieved 94.82% and 82% accuracy on the Jester dataset and a self-built dataset, respectively.

To study how the model performs on data of different modalities, this thesis examined methods for extracting optical flow data and semantic segmentation data, and extracted both modalities from the RGB data of the self-built dataset. The experiments test the effect of each of the three modalities on the network model separately, and compare training with weights initialized from a model trained on the Jester dataset against training without that initialization. Finally, using the model trained on the self-built dataset, this thesis conducted experiments on controlling various functions of a music player; the overall recognition rate in these experiments reached 90.8%.
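The abstract does not give the layer configuration, so the following is only a minimal sketch of how a 3D convolution, a residual (skip) connection, and an attention step can be combined in one block. It is not the thesis's actual model. Everything in it is an assumption made for illustration: the function names `conv3d_same`, `channel_attention`, and `residual_attention_block`, the squeeze-and-excitation style of channel attention, the 3x3x3 kernel, and the tiny input size. It uses plain Python lists so that it runs without any deep learning framework.

```python
import math

def conv3d_same(x, weight):
    """Naive 3D convolution, stride 1, zero padding ("same" output size).

    x:      nested list [C_in][T][H][W] (channels, frames, height, width)
    weight: nested list [C_out][C_in][k][k][k] with odd k
    """
    c_in, T, H, W = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    k = len(weight[0][0])
    p = k // 2
    out = []
    for kernel in weight:  # one kernel per output channel
        vol = [[[0.0] * W for _ in range(H)] for _ in range(T)]
        for t in range(T):
            for h in range(H):
                for w in range(W):
                    s = 0.0
                    for ci in range(c_in):
                        for dt in range(k):
                            for dh in range(k):
                                for dw in range(k):
                                    tt, hh, ww = t + dt - p, h + dh - p, w + dw - p
                                    # Positions outside the clip count as zeros.
                                    if 0 <= tt < T and 0 <= hh < H and 0 <= ww < W:
                                        s += kernel[ci][dt][dh][dw] * x[ci][tt][hh][ww]
                    vol[t][h][w] = s
        out.append(vol)
    return out

def channel_attention(x, fc):
    """Assumed SE-style gate: per-channel mean -> linear layer -> sigmoid -> rescale.

    fc is a hypothetical [C][C] weight matrix for the linear layer.
    """
    means = []
    for vol in x:
        vals = [v for frame in vol for row in frame for v in row]
        means.append(sum(vals) / len(vals))
    gates = [1.0 / (1.0 + math.exp(-sum(f * m for f, m in zip(fc_row, means))))
             for fc_row in fc]
    return [[[[g * v for v in row] for row in frame] for frame in vol]
            for vol, g in zip(x, gates)]

def residual_attention_block(x, weight, fc):
    """ReLU(x + attention(conv3d(x))); requires C_out == C_in for the skip path."""
    branch = channel_attention(conv3d_same(x, weight), fc)
    return [[[[max(0.0, a + b) for a, b in zip(row_a, row_b)]
              for row_a, row_b in zip(frame_a, frame_b)]
             for frame_a, frame_b in zip(vol_a, vol_b)]
            for vol_a, vol_b in zip(x, branch)]

# Tiny illustrative clip: 2 channels, 3 frames, 2x2 pixels (values are arbitrary).
clip = [[[[float(c + t + h + w) for w in range(2)] for h in range(2)]
         for t in range(3)] for c in range(2)]
zero_w = [[[[[0.0] * 3 for _ in range(3)] for _ in range(3)]
           for _ in range(2)] for _ in range(2)]
zero_fc = [[0.0, 0.0], [0.0, 0.0]]

# With an all-zero convolution branch, only the skip path contributes.
out = residual_attention_block(clip, zero_w, zero_fc)
```

With the convolution weights set to zero, the block returns its non-negative input unchanged. This is the property that lets residual networks grow deeper without degrading: a layer that learns nothing useful can fall back to the identity. A practical implementation would use a framework's 3D convolution layer with learned weights rather than these nested loops.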
Keywords/Search Tags:Human Computer Interaction, Dynamic Gesture Recognition, Deep Learning, 3D Convolution, Model