Speaker Verification Based On Limited Speech Data

Posted on: 2020-02-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wang
GTID: 2428330620959967
Subject: Control Science and Engineering
Abstract/Summary:
Speech carries a great deal of information, including cues that can be used to authenticate a speaker's identity. Speaker verification uses the speaker-specific information contained in speech to determine whether an utterance comes from a specific target speaker. From the classical Gaussian mixture model-universal background model (GMM-UBM) to the popular i-vector and probabilistic linear discriminant analysis (PLDA) framework, speaker verification systems have made great progress. However, to achieve good performance, existing methods require a large amount of training data to be collected in advance for training the system hyper-parameters. They also place requirements on the length of the speech used for speaker enrollment and testing: for short test utterances, system performance degrades rapidly. To address these problems, this thesis proposes solutions for short test utterances, limited training data in the target domain, and GMM model parameter adaptation. The main work and innovations are as follows:

1) The source of uncertainty in short-utterance i-vector estimation is analyzed, and the calculation of the Baum-Welch statistics used in i-vector extraction is improved. Weighted historical test information, together with parameter information from the background model, is used to enrich the speaker-specific information in the Baum-Welch statistics, and the improved statistics are then used to extract the i-vectors.

2) The traditional linear discriminant analysis (LDA) technique is modified and applied to compensate for the domain mismatch between the development and evaluation datasets. A small amount of target-domain training data is used to adapt system parameters trained on non-target-domain data, which effectively reduces the amount of target-domain training data required.

3) Test speech data is used to update the GMM speaker model in an unsupervised manner, and a model updating method based on cross-model similarity measurement is proposed. To control and optimize the data accumulation process, two models are established for each speaker and updated alternately at certain intervals. This allows more data to be used for model training.
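For readers unfamiliar with the Baum-Welch statistics mentioned in contribution 1), the following is a minimal sketch of the standard zeroth- and first-order statistics computed against a diagonal-covariance UBM. It is not the thesis's improved method. The function names, array shapes, and in particular the `blend_stats` helper with its `alpha` weight are assumptions made for illustration only; the abstract does not specify how the historical test information is weighted.

```python
import numpy as np


def baum_welch_stats(frames, weights, means, variances):
    """Standard zeroth/first-order Baum-Welch statistics for a diagonal UBM.

    frames:    (T, D) acoustic feature vectors of one utterance
    weights:   (C,)   UBM mixture weights
    means:     (C, D) UBM component means
    variances: (C, D) UBM diagonal covariances
    """
    # Log-density of every frame under every component, shape (T, C).
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_post = np.log(weights)[None, :] + log_gauss
    # Normalise in the log domain so each frame's posteriors sum to one.
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    post = np.exp(log_post)

    n = post.sum(axis=0)   # zeroth-order statistics, shape (C,)
    f = post.T @ frames    # first-order statistics, shape (C, D)
    return n, f


def blend_stats(n_cur, f_cur, n_hist, f_hist, alpha=0.5):
    """Hypothetical illustration: add down-weighted historical statistics.

    `alpha` is an assumed weight, not a value taken from the thesis.
    """
    return n_cur + alpha * n_hist, f_cur + alpha * f_hist
```

With a short utterance, T is small, so the entries of `n` are small and the resulting i-vector estimate is uncertain; this is the motivation for supplementing the statistics with additional information.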
Keywords/Search Tags: speaker verification, Gaussian mixture model, i-vector, short utterance, domain adaptation