Research Of Gesture Recognition Based On Computer Vision

Posted on: 2022-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Luo
GTID: 2518306338967229
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of electronic technology, computers and intelligent devices have entered thousands of households, and autonomous driving and intelligent control technologies have become part of everyday life. In intelligent interaction applications, gesture recognition is the most natural form of human-computer interaction. In recent years, with the development of computer vision, gesture recognition based on simple camera equipment, without data gloves or other cumbersome devices, has become the mainstream approach. In particular, deep learning has shown strong data processing ability in image recognition, and gesture recognition based on deep learning has become a research hotspot. Compared with two-dimensional images, dynamic gesture video contains more data and is a more challenging classification task. Manual feature extraction, which is already cumbersome, is harder and more costly to apply to dynamic gesture video.

Based on deep learning, this thesis studies the extraction of spatial and temporal features from dynamic gesture video and methods for fusing them effectively. To extract spatiotemporal features, the thesis proposes a multi-scale temporal feature extraction scheme based on dilated convolution and fuses the short-term spatiotemporal features of the hand gesture through a Convolutional Gated Recurrent Unit (ConvGRU). Furthermore, to improve ConvGRU, a variant structure is proposed that compresses the spatial features, which reduces the number of learnable parameters and improves the fusion of temporal features.

The thesis further studies gesture recognition with the two RGB-D data modalities. For hand detection and video key frame extraction, it proposes a data processing method, a gesture spatiotemporal attention mechanism based on video input, which increases the model's attention to the key positions of the hand in the video. In addition, transfer learning is applied to the two RGB-D modalities, which improves the accuracy of both single-modality and dual-modality gesture recognition.

To verify the effectiveness of the proposed methods, several groups of comparative experiments and analyses are carried out on two large public datasets, Jester and ChaLearn LAP IsoGD. The results show that the proposed methods effectively address some problems in spatiotemporal feature extraction and fusion, and achieve high performance in hand localization and in single-modality and dual-modality classification, with high recognition accuracy.
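The abstract does not describe the layers of the multi-scale temporal scheme, so the following is only a minimal sketch of the general idea behind dilated convolution along the time axis, not the thesis's implementation. It assumes each frame has already been reduced to a single scalar feature, and the function names, the 3-tap kernel, and the dilation rates (1, 2, 4) are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def dilated_temporal_conv(frames: Sequence[float],
                          kernel: Sequence[float],
                          dilation: int) -> List[float]:
    """Cross-correlate a per-frame feature sequence with a kernel along time.

    Taps are spaced `dilation` frames apart. Positions that fall outside the
    sequence are treated as zero, so the output has the same length as the
    input. The kernel length is assumed to be odd.
    """
    half = len(kernel) // 2
    out = []
    for t in range(len(frames)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - half) * dilation
            if 0 <= idx < len(frames):
                acc += w * frames[idx]
        out.append(acc)
    return out


def multi_scale_temporal_features(frames: Sequence[float],
                                  kernel: Sequence[float] = (1.0, 1.0, 1.0),
                                  dilations: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Run one dilated branch per rate and concatenate the results per time step.

    With a 3-tap kernel, a branch with dilation d covers a span of 2*d + 1
    frames, so the branches see short, medium, and longer temporal context.
    """
    branches = [dilated_temporal_conv(frames, kernel, d) for d in dilations]
    return [list(step) for step in zip(*branches)]


if __name__ == "__main__":
    # Eight frames, each summarized by one hypothetical scalar feature.
    features = multi_scale_temporal_features([1, 2, 3, 4, 5, 6, 7, 8])
    for t, vec in enumerate(features):
        print(t, vec)
```

In a real model each frame would carry a feature map rather than a scalar, the kernels would be learned, and the concatenated multi-scale output would then be passed to a recurrent fusion stage such as the ConvGRU mentioned above.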
Keywords/Search Tags: gesture recognition, spatiotemporal features, attention mechanism, multimodal fusion