Research On Dynamic Gesture Recognition Method Based On Three-dimensional Deep Neural Network

Posted on:2019-12-08

Degree:Master

Type:Thesis

Country:China

Candidate:X Xu


GTID:2428330572451751

Subject:Engineering

Abstract/Summary:


With the rapid development of science and technology, more and more technologies are being applied in people's daily lives. Making life more comfortable and simpler through technology has therefore become a shared goal of academia and industry. In recent years, the popularity of artificial intelligence has set off a wave of interest in intelligent living, in which human-computer interaction, as the channel of communication between people and machines, plays an essential role. Gesture recognition, a simple and natural way of interacting, has attracted growing attention. People expect gesture recognition to make human-computer interaction easier, more natural, and closer to human habits. To improve the accuracy of dynamic gesture recognition, this thesis makes the following contributions.

(1) To preserve as many of the frames that contain motion information in a gesture video as possible, a "key frame" extraction method is proposed. First, because a convolutional neural network requires inputs of uniform size, the number of frames in the gesture videos must be unified. The reference frame number for the network input is determined from a statistical analysis of the dataset. Second, during video sampling, a weighted average sampling method based on optical flow is used to keep the "key frames" that contain abundant motion information and to remove frames that carry little information. Because the optical flow value reflects the intensity of motion, each segment of the original video is sampled in proportion to its average optical flow value. The result is a gesture dataset with a unified frame number and rich motion information.

(2) To address the temporal characteristics of dynamic gestures and the degradation problem encountered in deep networks, a 3D convolutional neural network modified with the residual idea is used to extract gesture features. Dynamic gesture recognition requires a three-dimensional convolutional neural network to extract the temporal and spatial features of a gesture simultaneously. Further, to learn more abstract gesture features, the thesis uses a ResC3D network, which combines the residual idea with the three-dimensional convolutional neural network, to extract features from the RGB, depth, and optical flow data separately.

(3) Because a single type of data cannot express all the information in a gesture, a feature fusion strategy based on canonical correlation analysis is proposed. In gesture recognition, several kinds of data need to be fused to obtain more information. The thesis first analyzes fusion at the video, feature, and decision levels, and selects feature-level fusion according to the actual situation. Second, for feature fusion, the thesis analyzes the advantages and disadvantages of mean fusion and cascade fusion. Considering both recognition performance and training time, a canonical correlation analysis fusion method is adopted. It fuses the RGB, depth, and optical flow features on the basis of the correlation among the modal features and obtains a comprehensive, information-rich feature. This fusion strategy lays the foundation for the subsequent classification and recognition.

To verify the effectiveness of the proposed algorithm, the thesis carries out experiments and tests accuracy on IsoGD, the official dataset of the ChaLearn LAP Large-scale Isolated Gesture Recognition Challenge. First, comparative experiments are conducted and analyzed separately for each of the innovations described above, which demonstrates the effectiveness and necessity of the improvements. Then the final result of the algorithm is compared with other state-of-the-art algorithms evaluated on the same dataset. The comparison shows that the proposed method outperforms the other approaches.
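The optical-flow-weighted sampling described in contribution (1) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `sample_key_frames`, the number of segments, the largest-remainder rounding, and the evenly spaced picks within each segment are all assumptions made for illustration. The per-frame motion values are assumed to be precomputed, for example as the mean optical flow magnitude of each frame.

```python
import numpy as np

def sample_key_frames(flow_per_frame, target_frames, num_segments):
    """Pick `target_frames` frame indices, giving each segment a share
    proportional to its average optical flow (hypothetical helper)."""
    flow = np.asarray(flow_per_frame, dtype=float)
    n = len(flow)
    if n < num_segments:
        raise ValueError("video has fewer frames than segments")

    # Split the frame indices into contiguous, nearly equal segments.
    segments = np.array_split(np.arange(n), num_segments)
    seg_mean = np.array([flow[s].mean() for s in segments])

    # Segment weights: share of total average motion (uniform if no motion).
    total = seg_mean.sum()
    if total > 0:
        weights = seg_mean / total
    else:
        weights = np.full(num_segments, 1.0 / num_segments)

    # Turn fractional quotas into integer counts that sum to target_frames
    # (largest-remainder rounding, an assumption of this sketch).
    raw = weights * target_frames
    counts = np.floor(raw).astype(int)
    remainder = int(target_frames - counts.sum())
    order = np.argsort(-(raw - counts))
    counts[order[:remainder]] += 1

    # Take evenly spaced frames inside each segment; frames repeat when a
    # segment's quota exceeds its length (i.e. the segment is upsampled).
    indices = []
    for seg, k in zip(segments, counts):
        if k == 0:
            continue
        pos = np.linspace(0, len(seg) - 1, k).round().astype(int)
        indices.extend(seg[pos])
    return np.array(indices, dtype=int)

# Toy video: 24 frames with strong motion only in the middle third.
flow = [0.1] * 8 + [1.0] * 8 + [0.1] * 8
picked = sample_key_frames(flow, target_frames=12, num_segments=3)
```

In this toy input, most of the 12 output frames come from the high-motion middle segment, while the two low-motion segments contribute very few. The output always has exactly `target_frames` entries, which matches the requirement that the network receive a fixed frame count.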

Keywords/Search Tags:

dynamic gesture recognition, key frame extraction, three-dimensional convolution neural network, canonical correlation analysis, feature fusion


Related items

1	Discriminant Feature Extraction Algorithms Based On Canonical Correlation Analysis
2	Research On Feature Extraction And Image Recognition Based On Correlation Projection Analysis
3	Feature Extraction And Application Based On Canonical Correlation Analysis
4	Feature Extraction And Fusion Based On Multi-view Correlation Projection Analysis
5	Face Recognition Based On Canonical Correlation Analysis
6	Research On Two-dimensional Feature Extraction Methods
7	Gesture Recognition Based On Multi-modal Fusion Of RGB-D Images
8	Research On Chinese Pule Sign Language Gesture Recognition
9	Application Of Feature Extraction And Convolutional Neural Networks In Gesture Recognition
10	Locality Preserving The Canonical Correlation Analysis And Its Application To Face Recognition