Research On Speaker Recognition Based On VPT And GMM

Posted on:2015-03-29

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Lu

Full Text:PDF

GTID:2268330431950092

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

The speech signal conveys several levels of information: the message carried by the words, the language being spoken, and the gender and identity of the speaker. Automatic speaker recognition extracts the speaker-specific information from the speech signal and uses it to identify the speaker; it belongs to the field of biometric authentication. After decades of development, speaker recognition is widely applied in fields such as Internet access control, telephone banking transaction authentication and judicial security.

Commonly used approaches to speaker recognition fall into two categories: template matching, and probabilistic and statistical models. Template matching extracts feature vectors from the test speech and computes their similarity to those extracted from the training speech. Template models are simple and cheap to compute, but their recognition accuracy is relatively low. Probabilistic methods describe each speaker's characteristics with a specific probability density function (pdf), and recognition computes the log-likelihood ratio of the feature vectors extracted from the test speech under that pdf. These models are accurate and achieve a high recognition rate, but they are complex, so training and identification require a large amount of computation. As the number of target speakers in a recognition system grows, the time spent on recognition increases rapidly, and the system can no longer meet real-time requirements.

To address these shortcomings, this thesis proposes a two-level speaker recognition model based on VQ-VPT and GMM-UBM that splits recognition into two steps. First, a fast search finds the K target speaker voiceprint models most similar to the one to be identified. Then the precise GMM-UBM model computes the likelihood ratio of the test feature vectors and makes the final decision.

The fast recognition model is based on VQ-VPT: the codebooks of all target speakers are built with the LBG algorithm from vector quantization (VQ), and all code vectors are indexed with a vantage point tree (VPT), a balanced binary tree. The search time complexity is logarithmic, so the index is suitable for fast search. GMM-UBM serves as the precise recognition model to guarantee recognition accuracy, and a fast scoring approach further reduces the amount of computation. The two-level model combines the speed of template matching with the accuracy of probabilistic methods, improving recognition speed with limited performance loss.
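The two-step idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (`build_vpt`, `nearest`, `shortlist`, `llr`, `identify`), the per-frame voting rule used to pick the K candidates, the diagonal-covariance GMMs and the toy 2-D data are all assumptions made for illustration, and the codebooks are given directly rather than trained with LBG.

```python
import math
from collections import Counter

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_vpt(items):
    """Build a vantage point tree from (code_vector, speaker_id) pairs."""
    if not items:
        return None
    vp, rest = items[0], items[1:]
    if not rest:
        return {"vp": vp, "mu": 0.0, "inner": None, "outer": None}
    ds = sorted(dist(vp[0], it[0]) for it in rest)
    mu = ds[len(ds) // 2]  # median distance splits the rest roughly in half
    inner = [it for it in rest if dist(vp[0], it[0]) < mu]
    outer = [it for it in rest if dist(vp[0], it[0]) >= mu]
    return {"vp": vp, "mu": mu,
            "inner": build_vpt(inner), "outer": build_vpt(outer)}

def nearest(node, q, best=None):
    """Return (distance, speaker_id) of the code vector closest to q."""
    if node is None:
        return best
    d = dist(q, node["vp"][0])
    if best is None or d < best[0]:
        best = (d, node["vp"][1])
    # Search the likelier side first; visit the other side only if the
    # triangle inequality cannot rule it out.
    first, second = ("inner", "outer") if d < node["mu"] else ("outer", "inner")
    best = nearest(node[first], q, best)
    if abs(d - node["mu"]) <= best[0]:
        best = nearest(node[second], q, best)
    return best

def shortlist(tree, frames, k):
    """Stage 1 (assumed rule): each frame votes for its nearest speaker."""
    votes = Counter(nearest(tree, f)[1] for f in frames)
    return [spk for spk, _ in votes.most_common(k)]

def gmm_loglik(frame, gmm):
    """Log-likelihood of a frame; gmm = [(weight, means, variances), ...]."""
    logs = []
    for w, mean, var in gmm:
        ll = math.log(w)
        for x, m, v in zip(frame, mean, var):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))  # log-sum-exp

def llr(frames, speaker_gmm, ubm):
    """Average log-likelihood ratio of speaker model versus the UBM."""
    return sum(gmm_loglik(f, speaker_gmm) - gmm_loglik(f, ubm)
               for f in frames) / len(frames)

def identify(tree, speaker_gmms, ubm, frames, k):
    """Stage 2: score only the shortlisted speakers with the GMM-UBM ratio."""
    candidates = shortlist(tree, frames, k)
    return max(candidates, key=lambda s: llr(frames, speaker_gmms[s], ubm))

# Toy data: three hypothetical speakers with two code vectors each.
codebook = [((0.0, 0.0), "A"), ((1.0, 0.0), "A"),
            ((5.0, 5.0), "B"), ((6.0, 5.0), "B"),
            ((10.0, 0.0), "C"), ((10.0, 1.0), "C")]
tree = build_vpt(codebook)
ubm = [(1.0, (5.0, 2.0), (20.0, 20.0))]
speaker_gmms = {"A": [(1.0, (0.5, 0.0), (1.0, 1.0))],
                "B": [(1.0, (5.5, 5.0), (1.0, 1.0))],
                "C": [(1.0, (10.0, 0.5), (1.0, 1.0))]}
frames = [(5.2, 4.9), (5.8, 5.3), (4.9, 5.1), (1.5, 1.0)]
candidates = shortlist(tree, frames, 2)
result = identify(tree, speaker_gmms, ubm, frames, k=2)
```

In this sketch the expensive likelihood ratio is evaluated only for the K shortlisted speakers rather than for every enrolled speaker, which is where a two-level scheme would save time as the number of target speakers grows.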

Keywords/Search Tags:

Speaker Recognition, Vector Quantization, Vantage Point Tree, Gaussian Mixture Model, Universal Background Model

Related items

1	Research On Support Vector Machine For Speaker Recognition
2	Research On Universal Background Model And Preliminary Study On Deep Learning In Speaker Recognition
3	Research On Robust Speaker Recognition Technology Based On GMM-UBM
4	A Research On Text-independent Speaker Recognition
5	Application Study On Vector Quantization In Speaker Identification
6	Research On Technologies Of Speaker Recognition Based On Sparse Decomposition
7	Any Text Speaker Recognition System
8	Research On Text-Independent Speaker Verification System
9	Research On Speaker Recognition Based On Vector Quantization (VQ)
10	The Study Of Speaker Recognition Based On Vector Quantization