
Research Of Deep Learning Neural Networks Applications In Speech Recognition

Posted on: 2014-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: S Chen
Full Text: PDF
GTID: 2268330401458674
Subject: Signal and Information Processing
Abstract/Summary:
Traditional speech recognition technologies mainly use the template matching method, whereas modern speech recognition technologies increasingly rely on neural networks. An artificial neural network simulates the principles of human neuronal activity and has self-learning, association, contrast, reasoning and generalization capabilities. It provides a new way to solve complex pattern classification problems such as speech recognition. Deep learning has emerged in recent years as a new sub-area of machine learning research that mainly explores the modeling and learning problems of artificial neural networks with multiple layers of nodes. Such deep neural networks can deal with complex intelligence problems better. These network models mimic the information processing of the human brain more closely and can be better used for speech recognition.

Firstly, this paper introduces the theory and algorithms of speech acquisition, preprocessing, endpoint detection, feature extraction and the time warping network. In the feature extraction stage, the parameters used in practice are the Mel frequency cepstral coefficients (MFCC) and the first-order differential of the MFCC, which serve as the input data of the neural network speech recognition systems that follow.

We then study speech recognition based on a back-propagation (BP) neural network and propose a speech recognition method based on a mixture of MFCC and the first-order differential of MFCC, which may represent speech features better. The BP neural network recognition system is also optimized to reduce the training time and improve the recognition performance.

The restricted Boltzmann machine (RBM) model in deep learning is relatively easy to learn, and its training algorithm overcomes the low efficiency of training a multi-layer network directly. Thus, we stack RBMs to build a deep belief network (DBN) model for non-specific speech recognition.
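As an illustration of why a single RBM is relatively easy to train, the following is a minimal sketch of one contrastive divergence (CD-1) update. The abstract does not name the training rule, so CD-1 is an assumption here, as are the binary visible and hidden units (real-valued speech features would more likely need a Gaussian visible layer) and all function and parameter names.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample(probs, rng):
    # Draw binary states from independent Bernoulli probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]

def hidden_probs(v, W, c):
    # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W[i][j])
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, W, b):
    # p(v_i = 1 | h) = sigmoid(b_i + sum_j h_j * W[i][j])
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]

def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single visible vector (assumed training rule).

    Updates W, b and c in place and returns the squared reconstruction error.
    """
    ph0 = hidden_probs(v0, W, c)      # positive phase
    h0 = sample(ph0, rng)             # sampled hidden states
    pv1 = visible_probs(h0, W, b)     # reconstruction (probabilities)
    ph1 = hidden_probs(pv1, W, c)     # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
        b[i] += lr * (v0[i] - pv1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return sum((v0[i] - pv1[i]) ** 2 for i in range(len(b)))
```

In a stacked setup, the hidden probabilities of one trained RBM would serve as the visible data of the next, which is the usual greedy layer-wise way of building a DBN.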
By using a deep neural network, the model can adequately describe the correlation between features and combine the speech characteristics of consecutive frames. Because it simulates the multilayer structure of the human brain, the network can extract information features step by step and finally form ideal high-dimensional features for pattern classification, thereby enhancing the recognition performance.

In the DBN, we use the mixture of MFCC and the first-order differential of MFCC after time warping as input data. To enhance the learning outcomes of the model, we optimize the network model in the experiment according to the setting rules of RBM, and in comparison with the traditional BP model it is found to achieve better recognition results.
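The input construction described above can be sketched as follows. This is only an illustration under stated assumptions: the first-order differential is taken as a central difference with clamped edges, time warping is taken as linear interpolation to a fixed number of frames, and the differential is computed before warping. The abstract does not specify any of these choices, and the function names are hypothetical.

```python
def delta(frames):
    """First-order differential, assumed as d[t] = (c[t+1] - c[t-1]) / 2
    with the first and last frames repeated at the edges."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(a - b) / 2.0 for a, b in zip(nxt, prev)])
    return out

def time_warp(frames, target_len):
    """Linearly interpolate a frame sequence to exactly target_len frames
    (an assumed stand-in for the thesis's time warping step)."""
    n = len(frames)
    if n == 1 or target_len == 1:
        # Degenerate case: repeat the first frame.
        return [list(frames[0]) for _ in range(target_len)]
    out = []
    for k in range(target_len):
        pos = k * (n - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b
                    for a, b in zip(frames[lo], frames[hi])])
    return out

def build_input(mfcc, target_len):
    """Concatenate MFCC with its differential per frame, warp to a fixed
    length, and flatten into one fixed-size network input vector."""
    d = delta(mfcc)
    combined = [c + dc for c, dc in zip(mfcc, d)]
    warped = time_warp(combined, target_len)
    return [x for frame in warped for x in frame]
```

For example, `build_input(mfcc, 5)` on a three-frame utterance with one coefficient per frame yields a vector of 5 × 2 = 10 values, so utterances of different durations map to inputs of the same size.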
Keywords/Search Tags: neural network, speech recognition, back-propagation algorithm (BP), restricted Boltzmann machine (RBM), deep belief network (DBN)