
Research On Robust Speaker Recognition Over Noisy Short Utterance

Posted on: 2016-02-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Chen
GTID: 1318330512971805
Subject: Control Science and Engineering
Abstract/Summary:
Automatic speaker recognition has become an increasingly important modern technology, required by many speech-aided applications. Previous work has achieved good results on clean, high-quality speech with matched training and test acoustic conditions. However, with short utterances and environmental noise, both common in real-world conditions, the performance of speaker recognition systems degrades significantly and falls well short of a satisfactory level. To make speaker recognition more practical, robustness has therefore become a crucial research issue in the field. This dissertation investigates robust speaker recognition for noisy short utterances. To improve the recognition rate on noisy short speech, the compensation approach involves multi-feature fusion, noise separation, speech frame quality detection, and optimization and improvement of the recognition model. The major research work in the dissertation covers the following aspects:

1. Because training and test data are insufficient in noisy short utterance speaker recognition, sound-source information and channel information are combined to compensate for the inability of a single feature to express the speaker's voice characteristics when linguistic information is severely lacking. Different features differ in noise robustness and discriminative ability, so they can complement one another. A differential evolution algorithm is then used to optimize the fusion coefficient of each single feature in a feature group. Experiments show that, under the same conditions, the integrated feature-group system (MFCC_D_LPCC+ WOWOR4)+(MFCC_D_LPCC+ WOWOR6)+(MFCC_D_LPCC+ WOWOR8) improves the recognition rate on noisy short utterances by an average of 13.44% over the single feature MFCC and by an average of 10.53% over the feature group MFCC_D_LPCC. Across different signal-to-noise ratio environments, optimizing the fusion coefficients with the differential evolution algorithm improves the recognition rate by an average of 1.62%.

2. Noise separation is important for reducing the influence of noise on speaker recognition performance. A CNMF (Constrained Non-negative Matrix Factorization) algorithm is proposed. It first applies the FastICA noise separation algorithm to the noisy short utterance, then uses the separation result as the initial value of the NMF (Non-negative Matrix Factorization) algorithm, and adds a differential constraint to NMF in order to separate the noise effectively. Experiments show that, under the same conditions, CNMF improves the recognition rate by an average of 3.75% over NMF separation with random initialization.

3. After CNMF, the speech frames still contain varying degrees of residual noise and need further treatment. A speech frame quality discrimination algorithm divides the frames into high quality and low quality: high-quality frames are used directly for speaker recognition, while low-quality frames are processed further before being used. This significantly reduces the influence of noise and makes full use of the limited speech data, and so improves the recognition rate on noisy short utterances. Three speech frame quality discrimination algorithms are proposed: ISNRDA (Improved SNR Discrimination Algorithm), DDADA (Differences Detection and Discrimination Algorithm), and NMF-SNRDA (NMF-SNR Discrimination Algorithm). Experiments show that, under the same conditions, ISNRDA improves the recognition rate by an average of 3.26% over using no speech frame quality discrimination algorithm; DDADA improves it by an average of 1.71% over ISNRDA; and NMF-SNRDA improves it by an average of 1.74% over DDADA.

4. To classify speech frames more accurately, a dual information quality discrimination algorithm is proposed. If both speech frame quality discrimination algorithms judge a frame to be high quality, the frame is high quality; if one judges it high quality and the other low quality, the frame is medium quality; if both judge it low quality, the frame is low quality. Experiments show that, under the same conditions, the dual information quality discrimination algorithm improves the recognition rate by an average of 2.32% over a single discrimination algorithm.

5. The three classes of speech frames above are combined with the GMM-UBM three-stage classification model constructed in this dissertation, so that the limited speech data is used more fully and the effect of noise and insufficient speech data on the recognition rate for noisy short utterances is effectively reduced. The experimental data show that combining the dual information quality discrimination algorithm with the GMM-UBM three-stage classification model improves the recognition rate by an average of 2.4% over combining it with the GMM-UBM two-stage classification model.
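The three-way labelling rule in point 4 can be sketched in a few lines of Python. This is only an illustration of the decision rule as the abstract states it, not the dissertation's implementation. The names `FrameQuality`, `dual_quality` and `label_frames` are hypothetical. The two per-frame boolean inputs stand in for the outputs of two frame quality discrimination algorithms. The abstract does not say which pair is used, and the discriminators themselves are not modelled here.

```python
from enum import Enum


class FrameQuality(Enum):
    """Three-way frame label (hypothetical names, for illustration only)."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def dual_quality(first_is_high: bool, second_is_high: bool) -> FrameQuality:
    """Combine two binary quality decisions for one frame.

    Both high -> HIGH; exactly one high -> MEDIUM; both low -> LOW.
    """
    if first_is_high and second_is_high:
        return FrameQuality.HIGH
    if first_is_high or second_is_high:
        return FrameQuality.MEDIUM
    return FrameQuality.LOW


def label_frames(first_decisions, second_decisions):
    """Label every frame of an utterance from two per-frame decision lists.

    Each list holds one boolean per frame (True means "high quality"),
    assumed to come from two separate discrimination algorithms.
    """
    if len(first_decisions) != len(second_decisions):
        raise ValueError("both discriminators must cover the same frames")
    return [dual_quality(a, b) for a, b in zip(first_decisions, second_decisions)]


if __name__ == "__main__":
    # Made-up decisions for four frames, purely to show the call shape.
    labels = label_frames([True, True, False, False], [True, False, True, False])
    print([label.value for label in labels])
```

A downstream classifier could then route each frame by its label. How the three classes enter the GMM-UBM three-stage model is not specified in the abstract, so that step is left out of the sketch.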
Keywords/Search Tags: speaker recognition, speaker identification, noisy short utterance, multi feature fusion, noise separation, speech frame quality discrimination, dual information quality discrimination, GMM-UBM three stage classification model