
Research On Speaker Adaptation Of Neural Network Acoustic Models

Posted on: 2021-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: X H Pu
GTID: 2428330614965947
Subject: Electronic and communication engineering
Abstract/Summary:
In recent years, deep neural network (DNN) technology has developed rapidly in the field of automatic speech recognition (ASR) and has become the mainstream acoustic modeling technique. However, because the voices of target speakers in the production environment differ from those of the speakers in the training data, DNN-based acoustic models also suffer from model mismatch: when the system recognizes speech from a speaker who is not in the training data, recognition performance drops significantly. This thesis studies how speaker adaptation affects the performance of DNN-based speech recognition systems, focusing on the combination of speaker adaptation techniques with deep neural network models. The main work is as follows:

1. This thesis summarizes the research progress on deep neural network models and introduces in detail the structure and principles of the basic neural network model, the specific steps of model training, and the mathematical theory behind the key processes, and it gives solutions to common problems. Two speech recognition systems, based on the HMM-DNN and HMM-LSTM models, are then constructed as the baseline systems of this thesis, and their recognition performance is analyzed experimentally.

2. On the basis of the HMM-DNN model, this thesis proposes a new speaker adaptation technique and a corresponding DNN-based model structure. Drawing on ideas from voiceprint recognition and neural network dropout, the model adds a speaker-identity i-vector with a regularization coefficient to the acoustic model of the baseline system, so that the acoustic model adapts to the voice differences between speakers and better recognizes the general semantic information. Experiments show that this scheme effectively improves the recognition accuracy of the ASR system.

3. The thesis further studies recurrent neural networks (RNNs), in particular the long short-term memory (LSTM) network, which performs well among RNNs, and proposes a network model that adds a speaker-identity i-vector with a regularization coefficient to the LSTM acoustic model. The model adapts better to the differences between speakers, generalizes better, and improves system performance.

The experiments show that both DNN-based acoustic models achieve a higher recognition rate after the speaker-identity i-vector features are introduced, which supports the rationality and effectiveness of the speaker adaptation scheme for deep neural network acoustic models proposed in this thesis.
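As a rough illustration of the idea of feeding a coefficient-weighted speaker i-vector to the acoustic model, the sketch below appends a scaled i-vector to every acoustic feature frame before it would be passed to the network. The abstract does not say how the coefficient is applied or where the i-vector enters the network, so the function name `append_ivector`, the parameter `alpha`, and the choice of input-level concatenation are all assumptions made for illustration, not the thesis's actual method.

```python
import numpy as np


def append_ivector(frames, ivector, alpha=0.5):
    """Append a scaled speaker i-vector to every acoustic frame.

    frames:  (T, D) array of per-frame acoustic features.
    ivector: (K,) speaker identity vector, one per speaker or utterance.
    alpha:   hypothetical scaling coefficient (an assumption; the
             abstract does not define how its coefficient is used).
    Returns a (T, D + K) array suitable as network input.
    """
    frames = np.asarray(frames, dtype=np.float64)
    ivector = np.asarray(ivector, dtype=np.float64)
    if frames.ndim != 2 or ivector.ndim != 1:
        raise ValueError("expected frames of shape (T, D) and ivector of shape (K,)")
    # Repeat the same scaled i-vector for each of the T frames.
    tiled = np.tile(alpha * ivector, (frames.shape[0], 1))
    # Concatenate along the feature axis.
    return np.concatenate([frames, tiled], axis=1)
```

With `alpha` set to 0 the speaker information is suppressed entirely, and with `alpha` set to 1 the i-vector is passed through unscaled, so the coefficient controls how strongly the model can rely on speaker identity.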
Keywords/Search Tags:Deep Neural Network, Speech recognition, Speaker Adaptation, Acoustic Model