Research On Robust Speaker Recognition Technology Based On GMM-UBM

Posted on: 2017-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: D Zhang
Full Text: PDF
GTID: 2348330503489774
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the development of information technology and pattern recognition, speaker recognition has become increasingly bound up with daily life. Although many speaker recognition systems perform very well in the laboratory, their performance in real environments is often unsatisfactory. Research on robust speaker recognition attempts to improve system performance under such conditions. This thesis studies the practical application of GMM-UBM and I-Vector systems and, targeting their shortcomings in practice, proposes improvements in feature extraction, model selection, and score normalization.

Robust feature extraction is key to the success of a speaker recognition system. Mel-Frequency Cepstral Coefficients (MFCC) are simple to compute and discriminate well between speakers, because they combine the principles of human hearing with the properties of the cepstrum. MFCC is therefore considered one of the most successful feature representations for speaker recognition. However, real environments make the distribution of MFCC less Gaussian. To reduce the mismatch between the feature distributions seen in training and in recognition, RASTA filtering is first applied to reduce the influence of convolutional noise and channel differences, Cepstral Mean Subtraction is then used to remove the influence of linear noise, and feature warping is used to reshape the distribution of the MFCC. Finally, the robust MFCC features are obtained by combining the result with its first- and second-order differential coefficients.

In the scoring phase of the GMM-UBM system, TZ-Norm score normalization not only improves system performance but also makes it easy to set a unified threshold. At present, the bottleneck in the development of speaker recognition is how to overcome the channel variation between training and test speech.
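The feature post-processing chain described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes the MFCC have already been computed as a frames-by-coefficients NumPy array, it omits the RASTA filtering step, and the function names, the 301-frame warping window, and the delta regression width `k=2` are all assumptions made for the example.

```python
import numpy as np
from statistics import NormalDist


def cepstral_mean_subtraction(feats):
    """Subtract the per-utterance mean of each coefficient (frames x dims)."""
    return feats - feats.mean(axis=0, keepdims=True)


def feature_warping(feats, win=301):
    """Map each coefficient toward a standard normal distribution using
    its rank inside a sliding window centred on the current frame."""
    n_frames = feats.shape[0]
    half = win // 2
    inv_cdf = NormalDist().inv_cdf
    warped = np.empty_like(feats, dtype=float)
    for t in range(n_frames):
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        window = feats[lo:hi]
        # Rank of the current frame within the window (1..n), per dimension.
        rank = (window < feats[t]).sum(axis=0) + 1
        # Mid-rank probability, always strictly between 0 and 1.
        p = (rank - 0.5) / (hi - lo)
        warped[t] = [inv_cdf(float(q)) for q in p]
    return warped


def delta(feats, k=2):
    """Regression-based differential coefficients with edge padding."""
    n_frames = feats.shape[0]
    padded = np.pad(feats, ((k, k), (0, 0)), mode="edge")
    denom = 2.0 * sum(i * i for i in range(1, k + 1))
    out = np.zeros_like(feats, dtype=float)
    for i in range(1, k + 1):
        ahead = padded[k + i : k + i + n_frames]    # frame t + i
        behind = padded[k - i : k - i + n_frames]   # frame t - i
        out += i * (ahead - behind)
    return out / denom


def robust_mfcc(mfcc, win=301):
    """CMS, then feature warping, then append first and second differentials."""
    static = feature_warping(cepstral_mean_subtraction(mfcc), win=win)
    d1 = delta(static)
    d2 = delta(d1)
    return np.hstack([static, d1, d2])
```

With 13 static coefficients per frame, `robust_mfcc` returns 39 values per frame: the warped static coefficients followed by their first and second differentials.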
Based on the I-Vector modeling method, Probabilistic Linear Discriminant Analysis (PLDA) is chosen as the channel compensation technique. In addition, a spectral normalization technique is applied in the I-Vector space before training the PLDA model, in order to reduce the difference between the ideal model and the actual model. A simple scoring method is also proposed for the scoring phase to avoid the risk of missing a speaker because of poor-quality speech. Finally, the experimental results show that the proposed methods are effective.
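The TZ-Norm score normalization mentioned for the GMM-UBM system can be illustrated with a small sketch. The abstract does not give the exact formulation, and naming conventions for the order of the two steps differ between authors, so the following is only one common reading: every score is first Z-normalized with statistics from impostor utterances, and the target score is then T-normalized against the Z-normalized cohort scores. All function and argument names are hypothetical, and the score lists are assumed to be non-constant so that the standard deviations are non-zero.

```python
from statistics import mean, pstdev


def z_norm(score, impostor_scores):
    """Normalize with the statistics of one model scored against
    a set of impostor utterances."""
    return (score - mean(impostor_scores)) / pstdev(impostor_scores)


def t_norm(score, cohort_scores):
    """Normalize with the statistics of the test utterance scored
    against a set of cohort models."""
    return (score - mean(cohort_scores)) / pstdev(cohort_scores)


def tz_norm(raw, target_vs_impostors, cohort_vs_test, cohort_vs_impostors):
    """Z-normalize the target score and every cohort score, then
    T-normalize the target score against the Z-normalized cohort scores.

    raw                  -- target model scored against the test utterance
    target_vs_impostors  -- target model scored against impostor utterances
    cohort_vs_test       -- each cohort model scored against the test utterance
    cohort_vs_impostors  -- per cohort model, its scores against impostor utterances
    """
    z_target = z_norm(raw, target_vs_impostors)
    z_cohort = [
        z_norm(score, impostors)
        for score, impostors in zip(cohort_vs_test, cohort_vs_impostors)
    ]
    return t_norm(z_target, z_cohort)
```

Because the normalized score is expressed in standard deviations relative to impostor and cohort statistics, scores from different speakers and test utterances land on a comparable scale, which is what makes a single decision threshold practical.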
Keywords/Search Tags: Speaker recognition, Mel-Frequency Cepstral Coefficients, Gaussian Mixture Model-Universal Background Model, I-Vector