Research On Speaker Adaptation Methods Based On RNN-BLSTM Acoustic Model

Posted on: 2018-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Huang
GTID: 2348330512485624
Subject: Information and Communication Engineering
Abstract/Summary:
Speaker adaptation techniques aim to optimize the performance of a speech recognition system for a target speaker using adaptation data from that speaker. This can be done in two ways: by transforming a pre-trained speaker-independent (SI) speech recognition system into a speaker-dependent (SD) system that matches the SD acoustic features, or by transforming the SD acoustic features into SI acoustic features that match the pre-trained SI system. In either case, speaker adaptation addresses the mismatch between the speaker and the speech recognition system.

The recurrent neural network with bidirectional long short-term memory (RNN-BLSTM) acoustic model is powerful at temporal modeling, and it uses a set of controllers (gates) to address the vanishing and exploding gradient problems of the simple RNN acoustic model. Systems based on the RNN-BLSTM acoustic model can achieve more than a 10% relative reduction in word error rate (WER) compared with deep neural networks (DNNs) on some standard speech databases. However, the RNN-BLSTM acoustic model does not by itself solve the mismatch problem described above, so research on speaker adaptation methods for this model is important.

This dissertation studies speaker adaptation methods based on the RNN-BLSTM acoustic model. First, the speaker code based adaptation method is applied to the RNN-BLSTM acoustic model, and we analyze how the different controllers of the RNN-BLSTM memory cell influence recognition performance. Some heuristic methods are also proposed to optimize the traditional speaker code based method and further improve recognition performance. Then, a deep code (d-code) based offline adaptation method is proposed, which provides a way to solve the re-decoding problem of speaker code based adaptation. Experimental results show that the d-code based offline adaptation method achieves performance similar to the speaker code based method. Moreover, it outperforms the identity vector (i-vector) based adaptation method, which also needs no re-decoding, and its training process is more flexible. Finally, this dissertation investigates a d-code based online adaptation method. This method does not need to collect the whole sentence before adaptation; it performs speaker adaptation gradually during online speech recognition and shows good recognition performance.
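The general idea behind speaker code based adaptation can be illustrated with a toy example. The sketch below is not the thesis's RNN-BLSTM implementation: it assumes a single frozen linear layer with a squared-error loss, and every name in it (`adapt_speaker_code`, `W`, `V`, `b`, the dimensions, the learning rate) is hypothetical. It only shows the principle that the pre-trained weights stay fixed while a small per-speaker vector is estimated from that speaker's adaptation data.

```python
import numpy as np


def adapt_speaker_code(W, V, b, X, Y, code_dim, lr=0.05, steps=200):
    """Estimate a speaker code s with the model weights W, V, b frozen.

    Toy model (an assumption, not the thesis's BLSTM):
        y_hat = W @ x + V @ s + b
    Loss: mean over frames of the squared error ||y_hat - y||^2.
    """
    s = np.zeros(code_dim)               # speaker code starts at zero
    losses = []
    for _ in range(steps):
        pred = X @ W.T + V @ s + b       # shape: (n_frames, out_dim)
        err = pred - Y
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Gradient of the loss with respect to s only; W, V, b are not updated.
        grad_s = 2.0 * V.T @ err.mean(axis=0)
        s = s - lr * grad_s
    return s, losses


# Synthetic "adaptation data" for one speaker (hypothetical sizes).
rng = np.random.default_rng(0)
in_dim, out_dim, code_dim, n_frames = 5, 8, 3, 50
W = rng.normal(size=(out_dim, in_dim))           # frozen feature weights
V = 0.5 * rng.normal(size=(out_dim, code_dim))   # frozen code-connection weights
b = rng.normal(size=out_dim)                     # frozen bias
true_code = rng.normal(size=code_dim)            # unknown speaker offset to recover
X = rng.normal(size=(n_frames, in_dim))
Y = X @ W.T + V @ true_code + b

W_before = W.copy()
code, losses = adapt_speaker_code(W, V, b, X, Y, code_dim)
```

Because only `code` is updated, adapting to a new speaker means estimating a few parameters rather than retraining the network. In the dissertation, the speaker code is instead connected to the controllers of the BLSTM memory cell, which this sketch does not model.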
Keywords/Search Tags:speech recognition, speaker adaptation, BLSTM, RNN, speaker code, deep code, identity vector, online speaker adaptation