Research On Acoustic Modeling Of Speech Recognition Based On Recurrent Neural Network

Posted on: 2020-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: D F Wen
GTID: 2428330590471805
Subject: Control Science and Engineering
Abstract/Summary:
The acoustic model is the core module of a speech recognition system. With the development of deep learning, many deep models have been applied to acoustic modeling, greatly improving speech recognition performance. Among them, recurrent neural networks (RNNs) are particularly well suited to acoustic modeling because they can dynamically model the temporal information in speech. Research on RNN-based acoustic modeling has therefore become a hot topic. This thesis first introduces the basic principles of speech recognition, then analyzes the structures of various recurrent neural networks and their optimization algorithms, and then focuses on how to optimize the structure of the recurrent neural network to improve the recognition rate of the system. The main contents of this thesis are as follows:

1. An end-to-end acoustic modeling method based on recurrent neural networks is studied. To improve the convergence speed of the light gated recurrent unit (Li-GRU), an improved model, the Light Self Gated Recurrent Unit (Li-SGRU), is proposed, which replaces the ReLU activation with the Swish activation function. Four variants, Li-SGRU1, Li-SGRU2, Li-SGRU3 and Li-SGRU4, are then proposed to improve the training efficiency of the model. In addition, the effectiveness of these models for end-to-end modeling is studied in combination with connectionist temporal classification (CTC). The experimental results show that Li-SGRU not only converges quickly but also achieves a better recognition rate than Li-GRU. Moreover, the phone error rate of Li-SGRU1 is 3.1% lower than that of Li-SGRU, and its training time is reduced by 12.9%.

2. A hybrid HMM acoustic modeling method based on recurrent neural networks is studied. To improve the recognition performance of the system, the acoustic model structure and training algorithm of RNN-HMM are studied, and the effectiveness of five recurrent neural network structures (LSTM, GRU, Li-GRU, Li-SGRU and Li-SGRU1) for HMM modeling is analyzed under three feature types (MFCC, FBANK and fMLLR). Finally, these models were implemented with the open-source Kaldi and PyTorch-Kaldi toolkits. The experimental results show that Li-SGRU1 with fMLLR features gives better recognition performance and higher training efficiency.
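The replacement of ReLU by Swish described in item 1 can be illustrated with a minimal sketch. The abstract does not give the thesis's equations, so the cell below assumes the commonly published Li-GRU form (a single update gate, no reset gate) and only swaps the candidate activation for Swish, x * sigmoid(x). Batch normalization of the feed-forward terms is omitted, the four Li-SGRU variants are not modeled, and the names `swish`, `li_sgru_step` and the weight arguments are hypothetical, chosen for illustration only.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def swish(x):
    # Swish activation: x * sigmoid(x). Unlike ReLU it is smooth and
    # passes small negative values instead of clipping them to zero.
    return x * sigmoid(x)


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def li_sgru_step(x, h_prev, Wz, Uz, Wh, Uh):
    """One time step of an assumed Li-GRU-style cell with a Swish candidate.

    z     = sigmoid(Wz x + Uz h_prev)        (update gate)
    cand  = swish(Wh x + Uh h_prev)          (candidate state; ReLU in Li-GRU)
    h_new = z * h_prev + (1 - z) * cand
    Batch normalization is left out of this sketch.
    """
    z_pre = [a + b for a, b in zip(matvec(Wz, x), matvec(Uz, h_prev))]
    c_pre = [a + b for a, b in zip(matvec(Wh, x), matvec(Uh, h_prev))]
    z = [sigmoid(v) for v in z_pre]
    cand = [swish(v) for v in c_pre]
    return [zi * hp + (1.0 - zi) * ci for zi, hp, ci in zip(z, h_prev, cand)]
```

With all weights set to zero the update gate is 0.5 and the candidate is 0, so the step simply halves the previous state. This gives a quick sanity check of the interpolation between the old state and the candidate.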
Keywords/Search Tags: speech recognition, recurrent neural network, acoustic model, end-to-end