Research On Speaker Adaptation Of Neural Network Acoustic Models

Posted on:2021-02-28

Degree:Master

Type:Thesis

Country:China

Candidate:X H Pu

Full Text:PDF

GTID:2428330614965947

Subject:Electronic and communication engineering

Abstract/Summary:

In recent years, deep neural network (DNN) technology has developed rapidly in automatic speech recognition (ASR) and has become the mainstream acoustic modeling technology in the field. However, because the voices of target speakers in the production environment differ from those of the speakers in the training data, DNN-based acoustic models also suffer from model mismatch. When the system has to recognize a speaker who does not appear in the training data, recognition performance drops significantly. This thesis studies how speaker adaptation affects the performance of DNN-based speech recognition systems, mainly from the perspective of combining speaker adaptation techniques with deep neural network models.

1. This thesis summarizes the research progress on deep neural network models. It describes in detail the structure of the basic neural network model, the specific steps of model training, and the mathematical theory behind the key processes, and it gives solutions to common problems. Two speech recognition systems are then built, one based on the HMM-DNN model and one based on the HMM-LSTM model. These serve as the baseline systems of the thesis, and their recognition performance is analyzed experimentally.

2. Building on the HMM-DNN model, this thesis proposes a new DNN-based speaker adaptation technique and a corresponding model structure. Drawing on ideas from voiceprint recognition and neural network dropout, the model adds a speaker-identity i-vector with a regularization coefficient to the acoustic model of the baseline system. This lets the acoustic model adapt to the voice differences between speakers and better recognize the speaker-independent semantic information. Experiments show that the scheme effectively improves the recognition accuracy of the ASR system.

3. The thesis then turns to recurrent neural networks (RNNs), specifically the long short-term memory (LSTM) network, which performs well among RNN models. It proposes a network model that adds the speaker-identity i-vector with a regularization coefficient to the LSTM acoustic model. The model adapts better to the differences between speakers, generalizes better, and improves system performance.

The experiments show that both deep neural network acoustic models achieve a higher recognition rate once the speaker-identity i-vector features are introduced, which supports the rationality and effectiveness of the speaker adaptation scheme for deep neural network acoustic models proposed in this thesis.
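
The abstract does not say exactly how the i-vector, its coefficient, or the dropout idea enter the acoustic model. As a rough illustration only, the sketch below assumes one common arrangement: the speaker's i-vector is scaled by a coefficient and appended to every acoustic feature frame, and during training it is occasionally zeroed for a whole utterance in a dropout-like way. The function name `append_ivector` and the parameters `scale` and `drop_prob` are hypothetical and do not come from the thesis.

```python
import random


def append_ivector(frames, ivector, scale=0.5, drop_prob=0.0, rng=None):
    """Append a scaled speaker i-vector to every acoustic feature frame.

    frames:    list of per-frame feature lists for one utterance
    ivector:   one fixed-length i-vector for the speaker
    scale:     assumed weighting coefficient applied to the i-vector
    drop_prob: assumed probability of zeroing the i-vector for the whole
               utterance (a dropout-like step meant for training only)
    """
    rng = rng or random.Random()
    if drop_prob > 0.0 and rng.random() < drop_prob:
        # Dropout-like step: hide the speaker information for this utterance.
        speaker_part = [0.0] * len(ivector)
    else:
        speaker_part = [scale * v for v in ivector]
    # The same speaker vector is repeated on every frame of the utterance.
    return [list(frame) + speaker_part for frame in frames]


frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # two frames of 3-dim toy features
ivector = [1.0, -2.0]                         # toy 2-dim i-vector
augmented = append_ivector(frames, ivector, scale=0.5)
```

Under these assumptions each frame grows from 3 to 5 dimensions, and the augmented frames would be fed to the DNN or LSTM acoustic model in place of the original features.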

Keywords/Search Tags:

Deep Neural Network, Speech recognition, Speaker Adaptation, Acoustic Model

Related items

1	Research On Speaker Adaptation Of Neural Network Acoustic Models
2	Research On Speaker Adaptation Of Neural Network Acoustic Models For Speech Recognition
3	Study Of Speaker Adaptation Based On Neural Network Acoustic Model
4	Research On Speaker Adaptation Methods Based On RNN-BLSTM Acoustic Model
5	The Research Of Uyghur Acoustic Model Based On Deep Neural Network
6	Research On Acoustic Model Of Speech Recognition In Educational Scene Based On Deep Learning
7	Research On Speaker Adaptation In Speech Recognition
8	Research On Acoustic Modeling For Spontaneous Spoken Speech Recognition
9	The Study On Acoustic Model Based Neural Network In Mongolian Speech Recognition System
10	Research On Speaker Adaptation Of Deep Neural Network