Research Into Speech Recognition Based On Deep Learning

Posted on:2015-03-29

Degree:Master

Type:Thesis

Country:China

Candidate:J Liang

Full Text:PDF

GTID:2298330467462377

Subject:Signal and Information Processing

Abstract/Summary:

In the era of the mobile Internet, speech recognition remains the key to natural human-computer interaction. Meanwhile, in the age of big data, deep learning is attracting researchers' attention because of its efficiency in information mining. Research into speech recognition based on deep learning theory is therefore of great theoretical significance and practical value.

Deep learning is essentially an information extraction technique that relies on multi-layer nonlinear transformations. Its hierarchical structure makes it useful for modeling complex relationships among data. This thesis first introduces the basic principles and research status of speech recognition, then elaborates the basic theory and network models of deep learning, and finally focuses on how to fully exploit the potential of deep learning in speech recognition research.

1. Speech feature extraction based on the deep auto-encoder model

Good acoustic features play an important role in recognition systems. This thesis concentrates on the principle of the auto-encoder and discusses in depth several crucial components, such as feature preprocessing, network structure, and parallel training strategy. A deep auto-encoder fed with MFCC features is built on the Matlab platform to extract more robust features from the raw ones. The evaluation system is constructed with HTK. Experiments show improvements in word error rate of 1.96% and 3.53% over MFCC features when the new features are obtained with unsupervised and supervised training, respectively.

2. Acoustic modeling based on the deep neural network model

The acoustic model is also an indispensable component of a speech recognition system. This thesis first analyzes the similarities and differences between the neural network and the Gaussian mixture model with respect to model structure and training methods, then establishes the feasibility of the DNN-HMM model, which is used to give a more accurate description of the output probability. GMM-HMM and DNN-HMM acoustic models are trained separately on the Kaldi platform. With the RM corpus as training data, the experiment shows that applying the DNN-HMM model reduces the system's word error rate by 30% relative.
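
The results above are stated in terms of word error rate (WER), and the DNN-HMM result is a relative reduction. As a minimal sketch of how these two quantities are commonly computed (this is not the thesis's scoring code, which relies on HTK and Kaldi), the following uses the standard definition of WER as word-level edit distance divided by the number of reference words. The function names, the example sentences, and the 10% and 7% figures are illustrative assumptions, not values from the thesis.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer, new_wer):
    """Fraction of the baseline error removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical transcripts: one deleted word and one substituted word.
example_wer = word_error_rate("show all ships in the gulf",
                              "show ships in a gulf")

# Hypothetical WERs: a drop from 10% to 7% is 3 points absolute,
# which is a 30% relative reduction.
example_relative = relative_reduction(0.10, 0.07)
```

The distinction matters when reading the figures: a relative reduction is measured against the baseline's own error, so the same relative figure corresponds to different absolute changes depending on the baseline WER.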

Keywords/Search Tags:

Speech recognition, Deep learning, Feature extraction, Acoustic modeling, DNN, Deep Auto-encoder

Related items

1	Research Into Speech Recognition Application Based On Deep Learning
2	Research On Speech Recognition Method Based On Deep Learning
3	Research For Continuous Speech Recognition Based On Deep Neural Networks
4	Speech Feature Encoding And Emotion Recognition Based On Auto Encoder
5	Design Of Speech Recognition System For Solitary Words Based On Deep Learning
6	The Research On Children's Speech Acoustic Modeling Based On Deep Learning
7	Research On Speech Recognition Based On Deep Learning
8	The Research Of Speech Recognition Based On Deep Learning In Controller System
9	Smile Recognition Based On Gabor Feature And Deep Auto-encoders
10	Research On Phone Feature Recognition Based On Deep Learning