Research Of Text-Independent Speaker Recognition Based On VQ And GMM

Posted on: 2008-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: B S Chen
Full Text: PDF
GTID: 2178360215490563
Subject: Instrument Science and Technology
Abstract/Summary:
As one of the biometric techniques, speaker recognition is the process of automatically recognizing who is speaking on the basis of speaker-specific information contained in the speech waveform. Because of its particular advantages in convenience, economy and extensibility, the technique can be applied in many areas, such as telephone banking, telephone shopping, database access services, security control for confidential information and remote access to computers. As a result, it has attracted growing attention from biometrics researchers in recent years.

This thesis first introduces the concept of a speaker recognition system, then analyzes several commonly used methods for extracting speech feature parameters and several speaker recognition models. It studies the use of Vector Quantization (VQ) and Gaussian Mixture Models (GMM) for text-independent speaker recognition, and implements a basic speaker recognition system on an ARM embedded development board based on the S3C2410.

For the VQ model, the choice of codebook size is very important to the recognition rate. When the codebook is too small, the recognition rate drops considerably; however, the recognition rate also starts to fall once the codebook size exceeds 128, and the recognition time doubles. Taking overall system performance into account, a codebook size of 128 gives good results. In theory, a codebook that is too small partitions the feature space too coarsely, which raises the probability of false acceptance, while a codebook that is too large partitions it too finely, which raises the probability of false rejection. Either effect lowers the overall recognition rate, which agrees with the experiments.

For the GMM model, 64 mixtures give good results.
When the GMM has too few mixtures, the recognition rate is very low, because a weighted sum of too few Gaussians cannot adequately approximate the feature space of the speaker being identified. When it has too many mixtures, the recognition rate does not improve further and the recognition time increases considerably.

The experiments show that the training duration and the testing duration also strongly affect the recognition rate: longer durations give higher recognition rates. Once the training duration exceeds 30 seconds, however, the recognition rate does not improve further, which indicates that 30 seconds of training speech is enough. With 30 seconds of training and 1.5 seconds of testing, the correct recognition rate reaches 92.0% with the VQ model and 96.0% with the GMM model.

With the rapid development of embedded technology, these results can be applied in most situations that require verifying a speaker by voice, such as cellular phones, PDAs, voice-based time clocks and building control systems.
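
The two scoring rules compared in the abstract can be illustrated with a small sketch. It is not the thesis implementation: the abstract does not say how codebooks or mixtures were trained or which speech features were used, so this example assumes a plain k-means codebook, a diagonal-covariance GMM built from one hard assignment pass (no EM refinement), and toy 2-D vectors in place of real speech features. All function names, the speaker names, and the tiny model size of 4 (instead of 128 codewords or 64 mixtures) are hypothetical choices made for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(frame, codebook):
    """Index of the codeword closest to the frame."""
    return min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))


def train_codebook(frames, size, iters=20, seed=0):
    """Plain k-means codebook (assumed stand-in for the thesis's training method)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, size)]
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for f in frames:
            cells[nearest(f, codebook)].append(f)
        for k, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def vq_distortion(frames, codebook):
    """Average squared distance to the nearest codeword (lower = better match)."""
    return sum(sq_dist(f, codebook[nearest(f, codebook)]) for f in frames) / len(frames)


def gmm_from_codebook(frames, codebook, var_floor=1e-3):
    """Diagonal GMM as (weight, mean, variance) triples from one hard assignment pass."""
    cells = [[] for _ in codebook]
    for f in frames:
        cells[nearest(f, codebook)].append(f)
    gmm = []
    for mean, cell in zip(codebook, cells):
        if not cell:
            continue  # drop components that received no frames
        var = [max(var_floor, sum((x - m) ** 2 for x in col) / len(cell))
               for col, m in zip(zip(*cell), mean)]
        gmm.append((len(cell) / len(frames), mean, var))
    return gmm


def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood (higher = better match)."""
    total = 0.0
    for f in frames:
        logs = [math.log(w)
                - 0.5 * sum(math.log(2 * math.pi * v) + (x - m) ** 2 / v
                            for x, m, v in zip(f, mean, var))
                for w, mean, var in gmm]
        top = max(logs)  # log-sum-exp to avoid underflow
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total / len(frames)


def toy_frames(center, n, seed):
    """Synthetic 'feature vectors' scattered around a speaker-specific center."""
    rng = random.Random(seed)
    return [[c + rng.gauss(0, 0.5) for c in center] for _ in range(n)]


train = {"alice": toy_frames([0.0, 0.0], 200, 1),
         "bob": toy_frames([3.0, 3.0], 200, 2)}
codebooks = {name: train_codebook(fr, size=4) for name, fr in train.items()}
gmms = {name: gmm_from_codebook(train[name], codebooks[name]) for name in train}

test = toy_frames([3.0, 3.0], 20, 3)  # short unseen utterance from "bob"
vq_guess = min(codebooks, key=lambda n: vq_distortion(test, codebooks[n]))
gmm_guess = max(gmms, key=lambda n: gmm_loglik(test, gmms[n]))
print(vq_guess, gmm_guess)
```

In this sketch the `size` argument plays the role of the codebook size and mixture count discussed above: each extra codeword or mixture adds one more distance or density evaluation per frame, which is one way to see why recognition time grows with model size.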
Keywords/Search Tags: Speaker Recognition, Text-Independent, Vector Quantization, Gaussian Mixture Models, Embedded System