Speaker Recognition Technology Based On Vector Quantization And Gaussian Mixture Model

Posted on:2009-12-06

Degree:Master

Type:Thesis

Country:China

Candidate:N Chen

Full Text:PDF

GTID:2208360245982238

Subject:Circuits and Systems

Abstract/Summary:

Speaker recognition is a branch of biometrics. Its convenience, economy, and accuracy have attracted worldwide attention, and it is an inevitable trend in security authentication systems. This dissertation systematically studies the theory and technology of small-scale text-independent speaker recognition in five parts: speech de-noising, endpoint detection, feature extraction, recognition methods, and system implementation, and it proposes several effective improvements. A small standard speech corpus for speaker recognition, containing speech samples from 20 speakers, was established as the basis for algorithm tests. Wavelet de-noising based on soft thresholding at different scales destroys the power spectrum of unvoiced speech, so it cannot preserve the integrity of the speech. To overcome this disadvantage, we propose a new method called segmental wavelet de-noising, which preserves the integrity of the speech well while retaining the advantages of wavelet de-noising based on soft thresholding at different scales. The use of fractals in endpoint detection was studied; the results show that it is more robust than short-time energy and short-time zero-crossing rate, which makes it a better choice at low signal-to-noise ratios. A comparative study of existing features leads to two conclusions: Mel cepstrum coefficients have a clear advantage when used alone, and they distinguish speakers better than their second-order differential coefficients. A new feature called quasi-pitch frequency, based on the frequency spectrum, is proposed. Compared with pitch frequency, it is more robust to noise and to long-term variation.
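The two baseline endpoint-detection measures mentioned above, short-time energy and short-time zero-crossing rate, can be sketched as follows. This is a minimal illustration in Python with NumPy, not the thesis's MATLAB implementation and not its fractal method. The function names, the frame length, the hop size, and the `energy_ratio` threshold are all assumptions chosen for the example.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames; trailing samples are dropped.

    Assumes len(x) >= frame_len.
    """
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def detect_speech(x, frame_len=256, hop=128, energy_ratio=0.1):
    """Mark a frame as speech when its energy exceeds a fraction of the peak.

    energy_ratio is a hypothetical threshold used only for illustration.
    """
    energy = short_time_energy(frame_signal(x, frame_len, hop))
    return energy > energy_ratio * energy.max()
```

An energy threshold of this kind tends to fail once the noise energy approaches the speech energy, which is the low signal-to-noise case where the abstract reports that the fractal measure is more robust.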
Because speech has no precise class boundaries, rough sets are introduced into vector quantization, with no change in space complexity and no obvious increase in time complexity. This method reduces the impact of noise. Based on the differing impact of noise on different speech, a new adaptive compensation Gaussian mixture model is proposed, which improves recognition performance within a certain range of signal-to-noise ratios. An experimental platform for research on text-independent speaker recognition was developed in MATLAB. The results show that the improved algorithms outperform the traditional methods. The system that combines fractal endpoint detection, wavelet de-noising, Mel frequency cepstrum coefficients, and rough-set-based vector quantization performs well: at a signal-to-noise ratio of 20 dB, the recognition rate reaches up to 98.03%.
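For orientation, conventional vector-quantization speaker identification (the baseline that the rough-set variant modifies) can be sketched as below. This is a minimal Python/NumPy sketch under stated assumptions: codebooks are trained with plain k-means rather than any specific algorithm from the thesis, the rough-set extension is not included, and the function names, codebook size, and iteration count are hypothetical.

```python
import numpy as np

def train_codebook(features, n_codewords=4, n_iter=20, seed=0):
    """Train one speaker's codebook with plain k-means.

    features: array of shape (n_frames, n_dims), e.g. one feature vector per frame.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), n_codewords, replace=False)
    codebook = features[start].copy()
    for _ in range(n_iter):
        # Distance from every frame to every codeword.
        d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        for k in range(n_codewords):
            members = features[nearest == k]
            if len(members):  # keep the old codeword if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def average_distortion(features, codebook):
    """Mean distance from each frame to its nearest codeword."""
    features = np.asarray(features, dtype=float)
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def identify(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: average_distortion(features, codebooks[name]))
```

In use, one codebook is trained per enrolled speaker from that speaker's feature vectors, and a test utterance is assigned to the speaker whose codebook quantizes it with the least distortion.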

Keywords/Search Tags:

speaker recognition, fractal, rough sets, vector quantization, Gaussian mixture model

Related items

1	Combining Rough Sets And Neural Networks For Image Vector Quantization Coding Research
2	A Research On Text-independent Speaker Recognition
3	Research Of Speaker Recognition Base On VQ And GMM Models
4	Research On Text-Independent Speaker Recognition
5	Speech Recognition Access Control Applications
6	Research Of The Pattern Matching Method In Speaker Recognition System
7	Research On Support Vector Machine For Speaker Recognition
8	Based Text-independent Speaker Identification Technology
9	Any Text Speaker Recognition System
10	Research On Speaker Recognition Based On Combination Of Features