A Research On Speaker Recognition Algorithm And Speaker Identification System Implementation

Posted on:2011-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:S Q Yang

Full Text:PDF

GTID:2178360305977863

Subject:Computer application technology

Abstract/Summary:

Speaker recognition is the most natural biometric identification technique. It can be divided into two sub-fields: speaker identification and speaker verification. Speaker recognition automatically identifies a speaker from the characteristics embodied in the speaker's speech signal; the key issues are the choice of feature parameters and the recognition model. At present, linear predictive coding (LPC) parameters, LPC cepstrum coefficients (LPCC), and mel-frequency cepstral coefficients (MFCC) are often used as feature parameters in speaker recognition, and the popular recognition models include dynamic time warping (DTW), vector quantization (VQ), and the hidden Markov model (HMM).

LPCC represents the physiological differences in speakers' vocal tracts, while MFCC exploits the non-linear frequency characteristics of the human auditory system, that is, the way human hearing perceives speech. The Hilbert-Huang Transform (HHT) was proposed in 1998. Owing to its strong adaptive, time-varying processing capability for non-stationary and non-linear signals, HHT rapidly attracted wide attention and has been applied successfully in many areas of signal processing. It is also a new tool for speech signal processing. Each of these features (LPCC, MFCC, and HHT) has its own advantages, but applied alone each is far from enough to describe a speaker's discriminative characteristics. Each may carry semantic information as well as speaker characteristics, so using these diverse features together may be the best way to construct a reliable speaker recognition system.

Based on the above analysis, in the speaker recognition experiments LPCC, MFCC, and HHT features are each supplied to the speaker recognition system separately, and then combined MFCC and HHT features are used. Matlab is the development environment for the experiments in this thesis. LPCC, MFCC, HHT, and combined features are extracted or formed from the speech signal and then supplied to several popular models: dynamic time warping (DTW), hidden Markov model (HMM), and Gaussian mixture model (GMM). In addition, different settings of the Gaussian components in the GMM are tested to compare recognition performance.

The results show that, for speaker identification, HHT features achieve a better recognition rate than LPCC and MFCC; with combined features, GMM is preferable to DTW or DHMM; and the combined features are superior to any single feature (LPCC, MFCC, or HHT). HHT features can therefore serve as new parameters in speaker recognition when they are combined with MFCC features: the combined features may simultaneously capture the dynamic time characteristics of MFCC and the high frequency resolution of HHT, and may improve system performance. GMM may be the best recognition model for a speaker identification system.
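
The decision rule behind GMM-based speaker identification with combined features can be sketched as follows. This is a minimal illustration, not the thesis's Matlab implementation: it is written in Python, it assumes the MFCC and HHT features are combined by per-frame concatenation (the abstract does not state the combination scheme), it assumes diagonal covariances, and the function names, model parameters, and feature values are all made up for the example. Model training (for instance by expectation-maximization) is omitted; the models are given directly.

```python
import math

def combine_features(mfcc_frames, hht_frames):
    """Concatenate per-frame MFCC and HHT vectors (assumed combination scheme)."""
    if len(mfcc_frames) != len(hht_frames):
        raise ValueError("both feature streams must have the same number of frames")
    return [list(m) + list(h) for m, h in zip(mfcc_frames, hht_frames)]

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
        for xi, mu, v in zip(x, mean, var)
    )

def gmm_avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    The GMM is a list of (weight, mean, variance) triples, one per component.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over mixture components for numerical stability.
        terms = [math.log(w) + log_gaussian_diag(x, mean, var)
                 for w, mean, var in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)

def identify_speaker(frames, speaker_models):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    return max(
        speaker_models,
        key=lambda name: gmm_avg_log_likelihood(frames, speaker_models[name]),
    )

# Hypothetical toy models: two speakers, two components each, 3-dim features.
speaker_models = {
    "speaker_a": [(0.5, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]),
                  (0.5, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0, 5.0], [1.0, 1.0, 1.0]),
                  (0.5, [6.0, 6.0, 6.0], [1.0, 1.0, 1.0])],
}

mfcc = [[0.1, 0.2], [0.9, 1.1]]  # two frames of stand-in "MFCC" values
hht = [[0.0], [1.0]]             # two frames of stand-in "HHT" values
test_frames = combine_features(mfcc, hht)
print(identify_speaker(test_frames, speaker_models))  # prints speaker_a
```

Varying the number of `(weight, mean, variance)` triples per speaker is one way to explore how the Gaussian components affect recognition performance, under the same assumptions.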

Keywords/Search Tags:

Speaker Identification, Hidden Markov Model (HMM), Hilbert-Huang Transform (HHT), Mel-Frequency Cepstral Coefficients (MFCC)

Related items

1	Research Of Small Vocabulary, Speaker-independent Chinese Keyword Spotting Algorithm
2	The Research Of Extracting Pathological Voice's Characteristics Based On HHT And Recognition
3	Applying Hilbert-Huang Transform To Speaker Recognition
4	Study On Time-Frequency Analysis Algorithm Based On Hilbert-Huang Transform Technology
5	Computer Simulation Study On Hilbert-Huang Transform Technology
6	The Application Of Hilbert-Huang Transform In Noisy Speech Processing
7	Study Of Speech Recognition System For Mandarin Digit Based On HMM
8	Speaker Recognition Based On EMD
9	Speaker Recognition Based On Continuous Hidden Markov Model
10	Improved Hilbert-Huang Transform And Its Application Of The Signal Process In Power System