Research And Application Of Deep Learning Based Continuous Speech Recognition

Posted on: 2018-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: C J Li
Full Text: PDF
GTID: 2428330596954804
Subject: Software engineering
Abstract/Summary:
With the continuous improvement of computer performance, speech recognition based on deep learning has become feasible, and the modeling method for speech recognition has gradually shifted from GMM-HMM to DNN-HMM. The DNN-HMM model uses a single DNN, instead of a GMM, to output the state probabilities. Compared with a GMM, a DNN has a deeper structure and can extract high-level features from low-level features. Many researchers have shown that using DNN-HMM for speech modeling yields an error rate about a third lower than that of GMM-HMM. Following this trend, the thesis focuses on deep learning and speech recognition. The main work is as follows: (1) CD-DNN-HMM and BLSTM-HMM speech recognition models, both based on HMM, are designed and implemented. Experiments are carried out on the TIMIT speech corpus to verify the advantage of BLSTM in sequence modeling. (2) By analyzing the disadvantages of HMM-based hybrid models in sequence modeling tasks, the thesis proposes a BLSTM-CTC model for speech recognition. The experiments show that BLSTM-CTC performs better than BLSTM-HMM in sequence recognition tasks. (3) Using LSTM as the hidden-layer unit brings a large amount of computation, which lowers the operating efficiency of the system. The thesis therefore proposes combining GRU, instead of LSTM, with CTC for speech modeling. The experiments show that the error rates of the two models are close, but the training time of BGRU-CTC is 23% shorter than that of BLSTM-CTC. In addition, BGRU-CTC is improved by using 2 hidden layers with 256 units each, and the error rate of the resulting 2-BGRU-CTC model is reduced to 33%. (4) To meet the needs of online oral English learning, an oral scoring system is designed using the 2-BGRU-CTC speech recognition model. The system evaluates the recognition result by dynamic programming and can return the words that were pronounced incorrectly.
Keywords/Search Tags: Deep Learning, Hidden Markov Model, Speech Recognition, Connectionist Temporal Classification
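
Item (4) of the abstract says that the scoring system evaluates the recognition result by dynamic programming and returns the mispronounced words, but it does not give the algorithm. The following is a minimal sketch of one plausible approach, assuming a word-level edit-distance (Levenshtein) alignment between the reference sentence and the recognized sentence. The function names `align_words` and `oral_score`, and the percentage scoring formula, are hypothetical and are not taken from the thesis.

```python
def align_words(reference, recognized):
    """Align two sentences word by word with edit-distance dynamic programming.

    Returns (distance, wrong_words), where wrong_words lists the reference
    words that were substituted or missing in the recognized sentence.
    Assumption: this is an illustrative alignment, not the thesis's method.
    """
    ref = reference.lower().split()
    hyp = recognized.lower().split()
    n, m = len(ref), len(hyp)

    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # reference word missing
                dist[i][j - 1] + 1,         # extra recognized word
            )

    # Trace back from the bottom-right corner to collect the wrong words.
    wrong = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]:
            i, j = i - 1, j - 1             # correct word
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            wrong.append(ref[i - 1])        # substituted word
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            wrong.append(ref[i - 1])        # missing word
            i -= 1
        else:
            j -= 1                          # extra word, nothing to report
    wrong.reverse()
    return dist[n][m], wrong


def oral_score(reference, recognized):
    """Hypothetical score: percentage of reference words recognized correctly."""
    ref_len = len(reference.split())
    _, wrong = align_words(reference, recognized)
    if ref_len == 0:
        return 0.0, wrong
    return 100.0 * (ref_len - len(wrong)) / ref_len, wrong
```

In this sketch, `recognized` would be the word sequence produced by the speech recognition model. Extra words inserted by the recognizer do not lower the score, which is a simplifying design choice for the example.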