
The Study On Acoustic Model Based Neural Network In Mongolian Speech Recognition System

Posted on: 2018-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: H W Zhang
GTID: 2348330515452366
Subject: Computer Science and Technology
Abstract/Summary:
Speech recognition is an important means of human-computer interaction; its goal is to let people communicate with machines through speech. In recent years, speech recognition technology has changed dramatically: the Gaussian Mixture Model (GMM), long the main method of acoustic modeling, has been successfully replaced by deep neural networks. Different neural network structures, such as the Deep Neural Network (DNN), the Convolutional Neural Network (CNN), and the Long Short-Term Memory (LSTM) network, have been widely studied for acoustic modeling. However, these studies focus mainly on widely spoken languages such as English and Chinese. For minority languages, Mongolian among them, research on speech recognition technology is still at an early stage. In this dissertation, DNN, Time Delay Neural Network (TDNN), CNN, LSTM, and Feedforward Sequential Memory Network (FSMN) structures are applied to Mongolian acoustic modeling, and their influence on Mongolian speech recognition performance is studied by comparing the resulting acoustic models. To further improve the Mongolian acoustic models, two optimization methods are applied: discriminative training of the neural networks and the addition of speaker features. Experimental results show that the LSTM-based acoustic model performs best in the Mongolian speech recognition system. Because of its complex structure, however, LSTM requires more computation than the other structures. Compared with the other structures, FSMN offers the most balanced performance. Discriminative training and speaker features clearly improve acoustic model performance.
Keywords/Search Tags: speech recognition, neural network, acoustic model, Mongolian, discriminative training, speaker feature