
Research On A New Method Of Speaker Verification

Posted on: 2016-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: Q M Zhao
Full Text: PDF
GTID: 2308330470967679
Subject: Computer application technology
Abstract/Summary:
Voiceprint recognition is an important biometric authentication technology that is widely used in many scenarios. The main voiceprint identification methods, including SVM, JFA, and i-vector, are based on GMM-UBM. Besides the speech corpus needed to train the target speaker model, the GMM-UBM approach requires a large amount of additional speech for score normalization, which is a practical barrier to deploying voiceprint recognition. Existing endpoint detection methods generally operate on short frames, so they segment the signal too finely and incur excessive computation. To address these two shortcomings, this thesis explores a new approach to speaker verification. The major contributions are as follows:

1. We replace the conventional single-point test with a multi-point test: the scores of check speech sequences on the target speaker model are compared with their scores on a test model, which gives better robustness. We try three ways of deriving the test model from the test speech: adapting the UBM with the test speech, adapting the UBM with a mixture of the test speech and the training speech, and adapting the target speaker model with the test speech. The score deviation between the target speaker model and the test model is most pronounced when the test model is adapted from the UBM with the test speech. On this basis, we examine several ways of judging the size of the score deviation: a ratio threshold on the deviations, the TOP mean deviation score, the distance score, scoring of the sorted results, and the model parameter method. The TOP mean deviation score gives the best result, improving the EER by 4.2% over the reference methods.

2. In contrast to traditional voice activity detection, which processes short-time frames, we present a speech endpoint detection method that imitates human visual perception. It first extracts speech segment envelopes, which carry full-context information, together with waveform shape features. It then clusters the features and votes on the result, removing noise and silence segments to obtain the speech segments. The results are good: the relative improvement over the VQVAD algorithm reaches up to 45%, the system EER decreases by 1.7% on average, and each resulting speech segment contains only a single speaker.

3. Building on the two results above, we propose a new framework for speaker tracking based on segmentation by wave shape. We first split the speech by wave shape so that every segment is spoken by only one person, and then perform speaker verification on all segments. This reduces the complexity of speaker segmentation and speeds up detection. On MASC synthetic datasets, the miss rate and false alarm rate of traditional speaker verification reach 18%, and the miss rate and false alarm rate of the new speaker verification reach 28%. The new speaker tracking system is a feasible solution.
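
The abstract does not define the TOP mean deviation score from contribution 1, so the following Python sketch is only one plausible reading, not the thesis's implementation. It assumes that each check speech sequence has already been scored on both the target speaker model and the test model, that the deviation is the absolute score difference, and that a small deviation supports accepting the claimed speaker. The function names, `top_k`, and `threshold` are hypothetical.

```python
from typing import Sequence


def top_mean_deviation(target_scores: Sequence[float],
                       test_scores: Sequence[float],
                       top_k: int) -> float:
    """Mean of the top_k largest absolute score deviations.

    target_scores[i] and test_scores[i] are the scores of the i-th check
    speech sequence on the target speaker model and on the test model.
    (Assumed inputs; the scoring itself is outside this sketch.)
    """
    if len(target_scores) != len(test_scores):
        raise ValueError("score lists must have the same length")
    if not 1 <= top_k <= len(target_scores):
        raise ValueError("top_k must be between 1 and the number of scores")
    # Absolute deviation per check sequence, largest first.
    deviations = sorted(
        (abs(a - b) for a, b in zip(target_scores, test_scores)),
        reverse=True,
    )
    return sum(deviations[:top_k]) / top_k


def accept(target_scores: Sequence[float],
           test_scores: Sequence[float],
           top_k: int,
           threshold: float) -> bool:
    """Accept the claimed identity when the two models score similarly.

    Assumption: a small deviation means the test model resembles the
    target speaker model, so the decision is deviation <= threshold.
    """
    return top_mean_deviation(target_scores, test_scores, top_k) <= threshold


if __name__ == "__main__":
    target = [-1.0, -2.0, -3.0, -4.0]   # illustrative scores only
    test = [-1.5, -2.0, -5.0, -4.25]
    print(top_mean_deviation(target, test, top_k=2))
    print(accept(target, test, top_k=2, threshold=1.5))
```

Averaging only the largest deviations makes the decision depend on the check sequences where the two models disagree most. In practice `top_k` and `threshold` would have to be tuned on development data.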
Keywords/Search Tags: Speaker recognition, GMM-UBM, Voice activity detection, Envelope, Score deviation, Speaker segmentation