Research On Acoustic Modeling Of Speech Recognition Based On Recurrent Neural Network

Posted on:2020-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:D F Wen

GTID:2428330590471805

Subject:Control Science and Engineering

Abstract/Summary:

The acoustic model is the core module of a speech recognition system. With the development of deep learning, a large number of deep models have been applied to acoustic modeling, which has greatly improved speech recognition performance. Among them, recurrent neural networks (RNNs) are particularly well suited to acoustic modeling because they can dynamically model the temporal information in speech. Research on RNN-based acoustic modeling has therefore become a hot topic. This thesis first introduces the basic principles of speech recognition, then analyzes the structures of various recurrent neural networks and their optimization algorithms, and finally focuses on how to optimize the recurrent network structure and improve the recognition rate of the system. The main contents of this thesis are as follows:

1. An end-to-end acoustic modeling method based on recurrent neural networks is studied. To improve the convergence speed of the light gated recurrent unit (Li-GRU), an improved model, the Light Self Gated Recurrent Unit (Li-SGRU), is proposed, which replaces the ReLU activation with the Swish activation function. Four variants, Li-SGRU1, Li-SGRU2, Li-SGRU3 and Li-SGRU4, are then proposed to improve the training efficiency of the model. In addition, the effectiveness of these models for end-to-end modeling is studied in combination with connectionist temporal classification (CTC). The experimental results show that Li-SGRU not only converges quickly but also achieves a better recognition rate than Li-GRU. Moreover, the phone error rate of Li-SGRU1 is 3.1% lower than that of Li-SGRU, and its training time is reduced by 12.9%.

2. A hybrid HMM acoustic modeling method based on recurrent neural networks is studied. To improve the recognition performance of the system, the acoustic model structure and training algorithm of RNN-HMM are studied, and the effectiveness of five recurrent neural network structures (LSTM, GRU, Li-GRU, Li-SGRU and Li-SGRU1) for HMM modeling is analyzed under three features (MFCC, FBANK and fMLLR). Finally, these models are implemented with the open-source Kaldi and PyTorch-Kaldi toolkits. The experimental results show that Li-SGRU1 with fMLLR features achieves the best recognition performance and higher training efficiency.
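
The abstract describes Li-SGRU only as Li-GRU with the ReLU activation replaced by Swish, and gives no equations. As a rough illustration, the sketch below assumes the commonly described Li-GRU formulation (an update gate, no reset gate, and the state update h_t = z_t * h_{t-1} + (1 - z_t) * candidate) and Swish with its usual definition x * sigmoid(x). Batch normalization and bias terms are omitted, and the function names, weight names and values are hypothetical; it is not the thesis implementation and does not cover the Li-SGRU1 to Li-SGRU4 variants.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    # Candidate activation assumed for the baseline Li-GRU.
    return max(0.0, x)


def swish(x):
    # Swish (self-gated) activation: x * sigmoid(x).
    return x * sigmoid(x)


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def light_gru_step(x, h_prev, Wz, Uz, Wh, Uh, activation=swish):
    """One time step of a light GRU cell without a reset gate.

    Assumption: batch normalization and biases are left out for brevity.
    activation=relu gives a Li-GRU-style cell; activation=swish gives the
    Swish-based variant sketched here.
    """
    z_pre = [a + b for a, b in zip(matvec(Wz, x), matvec(Uz, h_prev))]
    c_pre = [a + b for a, b in zip(matvec(Wh, x), matvec(Uh, h_prev))]
    z = [sigmoid(v) for v in z_pre]            # update gate
    cand = [activation(v) for v in c_pre]      # candidate state
    # Interpolate between the previous state and the candidate state.
    return [zi * hp + (1.0 - zi) * ci for zi, hp, ci in zip(z, h_prev, cand)]


if __name__ == "__main__":
    # Hypothetical weights: 2 input features, 2 hidden units.
    Wz = [[0.1, -0.2], [0.3, 0.4]]
    Uz = [[0.0, 0.1], [-0.1, 0.2]]
    Wh = [[0.5, -0.5], [0.2, 0.1]]
    Uh = [[0.1, 0.0], [0.0, 0.1]]
    h = [0.0, 0.0]
    for frame in ([1.0, 0.5], [-0.3, 0.8], [0.2, -1.0]):
        h = light_gru_step(frame, h, Wz, Uz, Wh, Uh, activation=swish)
    print(h)
```

Passing `activation=relu` to the same function gives the ReLU baseline, so the two activations can be compared on identical weights. Unlike ReLU, Swish lets small negative pre-activations through instead of clamping them to zero.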

Keywords/Search Tags:

speech recognition, recurrent neural network, acoustic model, end-to-end

Related items

1	Research On Acoustic Modeling Of Speech Recognition Based On Recurrent Neural Network
2	Research And Implementation Of Speech Recognition Algorithm Based On Recurrent Neural Network
3	Structured Recurrent Neural Network And Its Applications In Automatic Speech Recognition
4	The Study On Acoustic Model Based Neural Network In Mongolian Speech Recognition System
5	Research On Speech Emotion Recognition Based On Convolutional Recurrent Neural Network
6	Research On Speech Recognition Based On Convolutional Neural Networks
7	Acoustic Model Of Speech Recognition Based On Lightweight Neural Network And Its Application In Robot
8	Recurrent Neural Network Language Model For Continuous Speech Recognition
9	Research On Mandarin Speech Recognition Technology Based On Deep Neural Network
10	Research On Speech Recognition Based On Compound Two-way Cyclic Network Under Specific Working Conditions