
Deep Learning For Robust Speech Recognition

Posted on: 2017-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y B You
Full Text: PDF
GTID: 2428330590491537
Subject: Computer Science and Technology
Abstract/Summary:
Since 2006, deep learning, as proposed by Hinton, has drawn much attention in both industry and academia. It has a strong capacity to learn nonlinear relationships within data. Recently, deep neural networks (DNNs) have achieved great success in automatic speech recognition. In a DNN-HMM hybrid system, the DNN replaces the GMM and provides posterior probability estimates for the HMM states at its output layer. Many groups have demonstrated that DNN-HMM hybrid systems can achieve a relative word error rate (WER) reduction of 20%-30% compared with traditional GMM-HMM systems. However, performance in noisy environments has not yet reached a desired level, and this remains a challenging problem when large vocabulary continuous speech recognition is applied in real-world applications. Performance degradation is, in most cases, caused by the mismatch between training and testing conditions. In this work, we explore DNN-based robust speech recognition under noisy conditions. First, we investigate noise-aware training, which assumes that the DNN can learn the nonlinear relationship between the clean signal and the noise-corrupted signal, and augments each observation input to the network with an estimate of the noise present in the signal. Then, a DNN is used as a regression model to transform distorted features into clean features. If dynamic features are added to the regression target, the enhanced features may be more stable and more invariant to environmental variability. When combined with noise-aware training and annealed dropout, this approach achieves an absolute improvement of 2.1% over the DNN-HMM baseline system on the Aurora4 database. Finally, a novel multi-task joint-learning framework is proposed to address noise robustness in speech recognition. The architecture integrates two different DNNs, a regressive denoising DNN and a discriminative recognition DNN, into a single multi-task structure, and all parameters can be optimized jointly from the very beginning of model training. After noise-aware training is applied to this framework, it achieves an absolute WER reduction of 2.7%.
Keywords/Search Tags: Deep neural network, Speech recognition, Robust, Feature enhancement, Multi-task, Noise-aware training
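
The input augmentation described in the abstract for noise-aware training (appending a noise estimate to each observation fed to the network) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. The function name `noise_aware_input`, the context width, and in particular the noise estimator (the mean of the first few frames, assumed to contain no speech) are assumptions made for illustration; the abstract does not say how the noise estimate is obtained.

```python
import numpy as np


def noise_aware_input(features, context=5, noise_frames=10):
    """Build noise-aware DNN inputs for one utterance.

    features: array of shape (T, D), one D-dimensional feature vector per frame.
    Returns an array of shape (T, (2 * context + 1) * D + D).
    """
    num_frames, _ = features.shape

    # Assumption: the first `noise_frames` frames contain only background
    # noise, so their mean serves as a crude per-utterance noise estimate.
    noise_estimate = features[:noise_frames].mean(axis=0)

    # Repeat the first and last frame so that every frame has a full
    # context window of `context` frames on each side.
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")

    # Splice 2 * context + 1 neighbouring frames into one vector per frame.
    windows = np.stack(
        [padded[i:i + num_frames] for i in range(2 * context + 1)], axis=1
    )
    spliced = windows.reshape(num_frames, -1)

    # Append the same noise estimate to every spliced observation.
    noise = np.tile(noise_estimate, (num_frames, 1))
    return np.concatenate([spliced, noise], axis=1)
```

For an utterance of 100 frames with 40-dimensional features and `context=5`, each network input has 11 * 40 + 40 = 480 dimensions. The last 40 hold the noise estimate, which is constant across the utterance.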