Research And Application Of Deep Learning Based Continuous Speech Recognition

Posted on:2018-05-30

Degree:Master

Type:Thesis

Country:China

Candidate:C J Li

GTID:2428330596954804

Subject:Software engineering

Abstract/Summary:

With the continuous improvement of computer performance, speech recognition based on deep learning has become practical, and the modeling approach for speech recognition has gradually shifted from GMM-HMM to DNN-HMM. The DNN-HMM model uses a single DNN, instead of GMMs, to output the state probabilities. Compared with a GMM, a DNN has a deeper structure and can extract high-level features from low-level ones. Many researchers have shown that modeling speech with DNN-HMM yields an error rate about one third lower than GMM-HMM. Following this trend, the thesis focuses on deep learning for speech recognition. The main work is as follows:
(1) CD-DNN-HMM and BLSTM-HMM speech recognition models, both based on HMM, are designed and implemented. Experiments are carried out on the TIMIT speech corpus to verify the advantage of BLSTM in sequence modeling.
(2) After analyzing the disadvantages of HMM-based hybrid models in sequence modeling tasks, the thesis proposes a BLSTM-CTC model for speech recognition. Experiments show that BLSTM-CTC performs better than BLSTM-HMM in sequence recognition tasks.
(3) Because using LSTM as the hidden-layer unit requires a large amount of computation and lowers the system's operating efficiency, the thesis proposes combining GRU with CTC for speech modeling in place of LSTM. Experiments show that the error rates of the two models are close, but the training time of BGRU-CTC is 23% shorter than that of the BLSTM model. In addition, BGRU-CTC is improved by using 2 hidden layers with 256 units, and the error rate of 2-BGRU-CTC is reduced to 33%.
(4) To meet the needs of online oral English learning, an oral scoring system is designed around the 2-BGRU-CTC speech recognition model. The system evaluates the recognition result using dynamic programming and can return the mispronounced words.
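The abstract does not say which dynamic-programming procedure the scoring system in item (4) uses. As an illustration only, the sketch below assumes a word-level edit-distance (Levenshtein) alignment between the prompted sentence and the recognizer output, and reports the reference words that were substituted or dropped. The function names `align_words` and `pronunciation_score`, and the 0 to 100 scoring formula, are hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple


def align_words(reference: List[str], recognized: List[str]) -> Tuple[int, List[str]]:
    """Align two word sequences with edit-distance dynamic programming.

    Returns the edit distance and the reference words that were
    substituted or deleted in the recognized sequence.
    """
    n, m = len(reference), len(recognized)
    # dist[i][j] = minimum edits turning reference[:i] into recognized[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == recognized[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # reference word missing
                dist[i][j - 1] + 1,         # extra recognized word
            )

    # Backtrace from the bottom-right cell to find the wrong reference words.
    wrong: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if reference[i - 1] == recognized[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                if cost:
                    wrong.append(reference[i - 1])  # substituted word
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            wrong.append(reference[i - 1])          # deleted word
            i -= 1
        else:
            j -= 1                                  # inserted word, nothing to flag
    wrong.reverse()
    return dist[n][m], wrong


def pronunciation_score(reference: List[str], recognized: List[str]) -> float:
    """Hypothetical 0-100 score: share of reference words not hit by an edit."""
    if not reference:
        return 0.0
    distance, _ = align_words(reference, recognized)
    return 100.0 * max(0.0, 1.0 - distance / len(reference))
```

For example, with the reference "the quick brown fox" and the recognized text "the quick crown fox" (both split on spaces), `align_words` returns a distance of 1 and flags "brown", and `pronunciation_score` gives 75.0. A real system would more likely align phoneme sequences than whole words, but the recurrence is the same.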

Keywords/Search Tags:

Deep Learning, Hidden Markov Model, Speech Recognition, Connectionist Temporal Classification

Related items

1	Research On Connectionist Temporal Classification In Speech Recognition
2	ASR Research Based On CTC
3	Design Of End-to-end Ando Tibetan Speech Recognition System Based On Deep Learning
4	Research On Speech Emotion Recognition Algorithm Based On Deep Learning
5	Research On CTC-based And Attention-based End-to-end Speech Recognition
6	Study On Attention Based Speech Emotion Recognition
7	Research On Tibetan Lhasa Dialect Speech Recognition Based On Deep Learning
8	Research And Implementation Of Speech Recognition Algorithm Based On Recurrent Neural Network
9	Research And Implementation Of End-to-End Long-term Speech Recognition Model Based On RNN-Transducer
10	Chinese Speech Recognition System Based On CLDNN Hybrid Model