Training GMM-UBM Against Sliced Wasserstein Distance For Speaker Recognition

Posted on:2020-11-18

Degree:Master

Type:Thesis

Country:China

Candidate:L Liu

Full Text:PDF

GTID:2428330596992268

Subject:Computer technology

Abstract/Summary:

Speaker recognition, an important research area in speech signal processing and an important biometric technology, is valuable for many applications, such as financial fraud prevention, mobile payment, criminal investigation, and identification in public security. The Gaussian mixture model-universal background model (GMM-UBM) is the most classical model in speaker recognition research. In GMM-UBM, the UBM is a high-order Gaussian mixture model that covers a broader range of speech features, so the parameter estimation of the Gaussian mixture model is extremely important.

The expectation-maximization (EM) algorithm is the most commonly used parameter estimation method for Gaussian mixture models. However, the EM algorithm can only guarantee that the likelihood function converges to a local extremum, and from a random starting point it is quite likely to converge to a poor one. Although the K-Means algorithm can alleviate this problem, its effect is limited. As a result, the EM algorithm may fail to train a good GMM-UBM model, which affects recognition accuracy.

To overcome these limitations of the EM algorithm, this thesis proposes estimating the GMM-UBM parameters by optimizing the sliced Wasserstein distance. Because the optimization landscape induced by the sliced Wasserstein distance contains fewer local extrema, optimizing it with stochastic gradient descent more easily yields a better GMM-UBM model and thus improves the model's speaker recognition rate. The traditional EM algorithm and the proposed method are compared experimentally. The results show that, under different initialization methods, different numbers of mixture components, and different enrollment data, the proposed method clearly improves the recognition rate, to varying degrees. On average, its recognition rate is about 5% higher than the better case of the traditional EM-GMM-UBM model.
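As a rough illustration of the quantity being optimized, the sketch below estimates the sliced Wasserstein distance between two equal-size sample sets: both sets are projected onto random unit directions, the one-dimensional projections are sorted, and the sorted values are compared. It is not the thesis's implementation. The function names `sliced_wasserstein` and `sample_gmm`, the spherical-covariance mixture, and all parameter values are assumptions made for this example, and the gradient-descent update of the mixture parameters is not shown.

```python
import numpy as np


def sliced_wasserstein(x, y, n_projections=64, p=2, rng=None):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    x and y are (n, d) arrays holding the same number of samples.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")

    # Draw random directions on the unit sphere.
    theta = rng.normal(size=(n_projections, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project onto each direction and sort: in one dimension, the optimal
    # transport plan between equal-size samples matches sorted values.
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)

    # Average the p-th power cost over samples and directions.
    return np.mean(np.abs(x_proj - y_proj) ** p) ** (1.0 / p)


def sample_gmm(weights, means, stds, n, rng=None):
    """Draw n samples from a GMM with spherical components (illustrative)."""
    rng = np.random.default_rng(rng)
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = rng.normal(size=(n, means.shape[1]))
    return means[comp] + stds[comp][:, None] * noise


# Hypothetical toy data standing in for acoustic feature vectors.
weights = np.array([0.5, 0.5])
stds = np.array([1.0, 1.0])
true_means = np.array([[0.0, 0.0], [4.0, 4.0]])
data = sample_gmm(weights, true_means, stds, 500, rng=0)

# Compare the data with samples from two candidate mixtures.
close = sample_gmm(weights, true_means, stds, 500, rng=1)
far = sample_gmm(weights, true_means + 3.0, stds, 500, rng=2)
print(sliced_wasserstein(data, close, rng=3))
print(sliced_wasserstein(data, far, rng=3))
```

A candidate mixture whose samples resemble the data would be expected to give the smaller of the two printed values. A training loop in the spirit of the abstract would adjust the mixture parameters to reduce this distance, for example with an automatic-differentiation framework, which is outside the scope of this sketch.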

Keywords/Search Tags:

GMM-UBM, Sliced Wasserstein Distance, Speaker Recognition, Training Parameter

Related items

1	Research On Image Applications Under Wasserstein Distance
2	Research And Implementation Of Speaker Recognition Methods For Telephone Speec
3	A New Method Of 3D Symmetric Shape Matching Based On Gromov-Wasserstein Distance
4	Target Similarity Measurement Algorithm Based On Wasserstein Distance
5	Research On Image Clustering Based On Unsupervised Deep Learning
6	Conditional Bidirectional Learn And Inference Based On Wasserstein Distance
7	3D Shape Matching Based On Gromov-wasserstein Distance
8	Dimensionality Reduction Technique For Visualization In Wasserstein Space
9	Gromov-Wasserstein Distance Optimization Based On Ant Colony Algorithm
10	Speaker Verification And Person Re-identification Based On Deep Metric Learning