
Research And Implementation Of Speech Recognition Algorithm Based On Recurrent Neural Network

Posted on: 2021-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: J R Dong
Full Text: PDF
GTID: 2428330611967583
Subject: Computer technology

Abstract/Summary:
With the rapid development and popularization of artificial intelligence, speech recognition has brought more and more convenience to people's lives. Applications such as smart speakers, voice assistants, and voice input methods can be seen everywhere, and users' requirements for speech recognition have gradually risen, mainly in terms of accuracy and efficiency. Traditional speech recognition is based on the hidden Markov model, but with the rapid growth of data volume, the processing efficiency of hidden-Markov-model-based methods increasingly fails to meet these needs, and researchers have begun to apply deep learning methods to the field of speech recognition. However, many models consider only the current state when processing the speech signal, ignoring the influence of contextual relevance on the current output, and spend too much time aligning input and output, so neither recognition accuracy nor efficiency is high enough. This paper argues that assigning "memory" weight to relevant contextual information helps to judge the current output and to convert the temporal information of speech into the correct characters more accurately.

In view of these problems, this paper proposes a hybrid GRU-CTC model that incorporates the LeakyReLU function. The model has two main characteristics. First, the Gated Recurrent Unit (GRU), a variant structure of the recurrent neural network, is introduced: contextual relevance is fully considered through its double gating mechanism, and selective memory is carried out through weight assignment. On this basis, the LeakyReLU activation function is incorporated to improve the training convergence efficiency of the model. Second, Connectionist Temporal Classification (CTC) is introduced, which can process the whole input sequence directly without prior alignment of the speech data and the text data. This removes the time-consuming input-output alignment step of conventional models and improves the speed of model training.

To verify the proposed model, two groups of experiments are conducted. The first group compares GRU-CTC with two other training models. The results show that the character error rate (CER) of the GRU-CTC hybrid model is the lowest of the three; compared with the suboptimal LSTM-CTC, the CER is reduced by 1.03%, improving accuracy to a certain extent. The second group compares and analyzes GRU-CTC with different activation functions under different language models. The results show that the CER of Leaky GRU-CTC under the tri-gram language model is 1.32% lower than that of the suboptimal model, with faster training convergence and higher recognition accuracy.
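The thesis abstract does not include code; as a rough illustration of the architecture it describes, the following is a minimal PyTorch-style sketch that combines a bidirectional GRU acoustic model, a LeakyReLU projection, and CTC loss. The class name GRUCTCModel, the layer count, hidden size, feature dimension (80 mel bands), and vocabulary size are illustrative assumptions, not values taken from the thesis.

# Minimal sketch (not the thesis implementation): a GRU acoustic model with a
# LeakyReLU projection head trained with CTC loss. All sizes are assumptions.
import torch
import torch.nn as nn

class GRUCTCModel(nn.Module):
    def __init__(self, n_mels=80, hidden=256, vocab_size=29):
        super().__init__()
        # Bidirectional GRU captures left and right context of each frame.
        self.gru = nn.GRU(n_mels, hidden, num_layers=2,
                          batch_first=True, bidirectional=True)
        self.act = nn.LeakyReLU(negative_slope=0.01)
        # Output layer maps to vocab_size + 1 classes (index 0 is the CTC blank).
        self.fc = nn.Linear(hidden * 2, vocab_size + 1)

    def forward(self, feats):                     # feats: (batch, time, n_mels)
        out, _ = self.gru(feats)
        logits = self.fc(self.act(out))           # (batch, time, vocab + 1)
        # CTCLoss expects log-probabilities shaped (time, batch, classes).
        return logits.log_softmax(dim=-1).transpose(0, 1)

# Usage sketch with random data: CTC aligns the frame-level outputs with the
# target transcript without any pre-computed frame/character alignment.
model = GRUCTCModel()
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

feats = torch.randn(4, 200, 80)                   # 4 utterances, 200 frames each
targets = torch.randint(1, 30, (4, 20))           # 4 transcripts, 20 tokens each
input_lengths = torch.full((4,), 200, dtype=torch.long)
target_lengths = torch.full((4,), 20, dtype=torch.long)

log_probs = model(feats)                          # (200, 4, 30)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()

The design point this sketch illustrates is that nn.CTCLoss consumes frame-level log-probabilities together with the unaligned target transcript, so no frame-to-character alignment has to be prepared beforehand, which is the property the abstract attributes to CTC.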
Keywords/Search Tags:Deep learning, Speech Recognition, Recurrent neural network, Gated Recurrent Unit, Connectionist temporal classification, LeakyReLU