
Speaker Recognition Technology Based on Vector Quantization and Gaussian Mixture Models

Posted on: 2009-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: N Chen
Full Text: PDF
GTID: 2208360245982238
Subject: Circuits and Systems
Abstract/Summary:
Speaker recognition is a branch of biometrics. It has attracted worldwide attention for its convenience, economy and accuracy, and it is becoming an important trend in security authentication systems. This dissertation presents a systematic study of the theory and technology of small-scale text-independent speaker recognition, covering five areas: speech de-noising, endpoint detection, feature extraction, recognition methods, and system implementation, with some effective improvements in each. A standard small speech corpus for speaker recognition was established; it contains speech samples from 20 speakers and provides the basis for the algorithm tests. Wavelet de-noising based on soft thresholding at different scales damages the power spectrum of unvoiced speech, so it cannot preserve the integrity of the speech signal. To address this disadvantage, we propose a new method named segmental wavelet de-noising. This method preserves the integrity of the speech well while retaining the advantages of wavelet de-noising based on soft thresholding at different scales. Fractal-based endpoint detection was also studied. The results show that it is more robust than short-term energy and short-term zero-crossing rate, which makes it a better choice at low signal-to-noise ratios. A comparative study of existing features leads to two conclusions: Mel cepstrum coefficients have a clear advantage when a single feature is used alone, and Mel cepstrum coefficients distinguish speakers better than their second-order differential coefficients. A new feature named quasi-pitch frequency is proposed. It is derived from the frequency spectrum and, compared with pitch frequency, is more robust to noise and to long-term variation.
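To make the baseline concrete, the following is a minimal sketch of conventional wavelet de-noising with a soft threshold. It is not the segmental method proposed in the thesis, whose details the abstract does not give. The single-level Haar transform, the even-length input, and the function names (`soft_threshold`, `haar_forward`, `haar_inverse`, `denoise`) are all assumptions made for illustration.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def haar_forward(signal):
    """Single-level Haar transform (assumes an even-length signal)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def denoise(signal, threshold):
    """Soft-threshold the detail coefficients, keep the approximation."""
    approx, detail = haar_forward(signal)
    detail = [soft_threshold(d, threshold) for d in detail]
    return haar_inverse(approx, detail)
```

Because the threshold is applied uniformly to the detail coefficients, low-energy, noise-like segments such as unvoiced consonants are attenuated along with the noise. This is consistent with the drawback the abstract attributes to the conventional method.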
Because speech classes have no precise boundaries, rough set theory is introduced into vector quantization, with no change in space complexity and no significant increase in time complexity. This method reduces the impact of noise. Because noise affects different speech sounds to different degrees, a new adaptive compensation Gaussian mixture model is proposed; it improves recognition performance within a certain range of signal-to-noise ratios. An experimental platform for research on text-independent speaker recognition has been developed in MATLAB. The results show that the improved algorithms outperform the traditional methods. The system combining fractal endpoint detection, wavelet de-noising, Mel frequency cepstrum coefficients and rough-set-based vector quantization performs well. When the signal-to-noise ratio equals 20 dB, the recognition rate can reach 98.03%.
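As an illustration of the vector quantization approach to speaker identification, here is a minimal sketch of the conventional method, without the rough-set extension described above. One codebook is trained per speaker, and a test utterance is assigned to the speaker whose codebook gives the lowest average distortion. The plain k-means training, the squared Euclidean distance, the toy two-dimensional features, and all function names are assumptions made for illustration. The platform described in the thesis was built in MATLAB.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_codebook(vectors, k, iterations=10):
    """Plain k-means codebook; initialised from the first k vectors."""
    codebook = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: sq_dist(v, codebook[j]))
            clusters[nearest].append(v)
        for j, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                dim = len(members[0])
                codebook[j] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return codebook


def average_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: average_distortion(vectors, codebooks[name]))


# Toy 2-D features standing in for per-frame cepstral vectors (hypothetical data).
training = {
    "speaker_a": [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9]],
    "speaker_b": [[5.0, 5.0], [5.2, 4.9], [6.0, 6.0], [6.1, 5.8]],
}
codebooks = {name: train_codebook(vecs, k=2) for name, vecs in training.items()}
```

With these toy codebooks, `identify([[0.05, 0.1], [1.0, 1.05]], codebooks)` selects `speaker_a`, since those frames lie closest to that speaker's codewords.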
Keywords/Search Tags: speaker recognition, fractal, rough sets, vector quantization, Gaussian mixture model