
Research And Implementation Of End-to-end Speech Recognition System Based On CTC Method

Posted on: 2020-11-08 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Lu | Full Text: PDF
GTID: 2438330572487381 | Subject: Computer Science and Technology
Abstract/Summary:
Automatic speech recognition is a key technology for making communication among people and among machines smoother. With the gradual popularization of new social media, the amount of data on the Internet has increased significantly, which has greatly reduced the recognition efficiency of traditional speech recognition systems. In traditional speech recognition methods, the training corpus must be labeled not only with the transcript text but also with the phonemes aligned in chronological order, which incurs large labor costs. Neural network techniques can simplify this process. Connectionist Temporal Classification (CTC) computes the probability of a label sequence by considering the set of all possible alignments between that sequence and a speech sample. Because the audio sequence is mapped directly to text, the separate standard language model and acoustic model can be eliminated, which makes speech recognition independent of the language: given enough samples, a model can be trained. This thesis focuses on an end-to-end speech recognition system based on the CTC method. The main research contents are: 1) An in-depth study of the LSTM structure, an improved LSTMP network structure, and a proposed Re-dimension method that allows the network to learn historical information autonomously; experiments verify that it improves speech recognition accuracy. 2) The Batch Normalization (BN) algorithm, previously used on DNN models, is adapted to work on the LSTM network. 3) During neural network training, the Target Delay method is used to realize an adaptive CTC algorithm, so that the context of the unidirectional LSTM model is modeled accurately. Experiments are carried out on the collected data set. The results show that end-to-end speech recognition based on the CTC method can improve recognition efficiency and that, as the amount of data increases, it will surpass the performance of traditional speech recognition systems.
Keywords/Search Tags:Speech Recognition, End-to-end, CTC, Batch Normalization, Target Delay
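
The abstract describes CTC as computing the probability of a label sequence over all of its possible frame-level alignments. The following is a minimal sketch of the standard CTC forward recursion in plain Python, not the thesis's implementation. The function names, the choice of index 0 for the blank symbol, and the per-frame probability table `probs` are all assumptions made for illustration. A brute-force enumerator is included only to show what the recursion sums over.

```python
import itertools


def collapse(path, blank=0):
    """Map a frame-level path to labels: merge repeats, then drop blanks."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return out


def ctc_sequence_probability(probs, labels, blank=0):
    """Sum the probability of every alignment that collapses to `labels`.

    probs  -- list of T per-frame distributions (lists indexed by symbol id)
    labels -- target label ids, without blanks
    """
    # Extended target: a blank before, between, and after every label.
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    size = len(ext)

    # alpha[s]: probability of all path prefixes ending at ext[s].
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end on the last label or on the trailing blank.
    result = alpha[size - 1]
    if size > 1:
        result += alpha[size - 2]
    return result


def brute_force_probability(probs, labels, blank=0):
    """Enumerate every path explicitly (exponential cost; tiny inputs only)."""
    n_symbols = len(probs[0])
    total = 0.0
    for path in itertools.product(range(n_symbols), repeat=len(probs)):
        if collapse(path, blank) == list(labels):
            p = 1.0
            for t, sym in enumerate(path):
                p *= probs[t][sym]
            total += p
    return total
```

On small inputs the two functions should agree up to floating-point error. The recursion needs on the order of T × (2L + 1) steps for T frames and L labels, whereas the enumeration grows exponentially with T. Practical systems usually run this recursion in log space to avoid underflow on long utterances, which this sketch omits.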