
Study Of Speaker Recognition System Based On MFCC And GMM

Posted on: 2007-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: A M Ding
Full Text: PDF
GTID: 2178360182488542
Subject: Computer software and theory
Abstract/Summary:
As one of the biometric techniques, speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information contained in the speech waveform. Because of its particular advantages in convenience, economy and extensibility, the technique can be applied in a number of areas, such as telephone banking, telephone shopping, database access services, security control for confidential information and remote access to computers. As a result, many researchers in China and abroad are working on it. This paper focuses on a speaker recognition system based on Mel-Frequency Cepstral Coefficients (MFCC) and the Gaussian Mixture Model (GMM).

The impulse response of the vocal tract is an important feature of a speaker. A speech signal is the convolution of a source signal (an impulse train) with the impulse response of the vocal tract. This paper introduces two methods for obtaining cepstral coefficients by deconvolution: linear prediction coefficient analysis and homomorphic transformation. After deconvolution, the cepstral coefficients related to the impulse response can be extracted to form the feature vectors.

Drawing on a model of the human auditory system, this paper gives a new method for extracting cepstral coefficients by warping the power spectrum onto the Mel-frequency scale. In extracting the Mel-frequency cepstral coefficients (MFCCs), we use a filter bank that is consistent with the distribution of the critical bands of the human cochlea, so as to mimic the human ear's nonlinear response to frequency. This paper gives the theoretical basis and the algorithm for computing MFCCs in detail, and demonstrates the validity of the method by comparing the performance of MFCC with that of traditional LPCC in a speaker recognition experiment.

GMM is regarded as one of the best pattern recognition techniques for this task because of its good performance, simplicity and low complexity. This paper introduces the concept of the GMM, the algorithm for computing the mixture model parameters and the way the GMM is used in a speaker recognition system, and analyses experimentally the performance obtained with different numbers of mixture components.

In addition, an extension of the MFCC feature with normalized short-time energy and dynamic information is discussed, and its influence on identification performance is analyzed.
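
The pipeline summarized above (Mel filter bank, log energies, cepstral coefficients, GMM scoring) can be illustrated with a minimal sketch. It is not the thesis's implementation, and everything concrete in it is an assumption made for illustration: the common 2595 * log10(1 + f/700) Mel formula, triangular filters, a Hamming window, a DCT-II that drops the zeroth coefficient, diagonal covariances, and all function names and default parameters. Model parameters are assumed to be already trained; the EM training step is not shown.

```python
import math

def hz_to_mel(hz):
    # One common Mel-scale formula (assumed; the thesis may use another).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the Mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced in Mel, converted to Hz.
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Hamming-windowed power spectrum via a naive DFT (fine for short frames)."""
    n = len(frame)
    win = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, s in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mfcc(frame, sample_rate, n_filters=8, n_ceps=5):
    """Log Mel filter-bank energies followed by a DCT-II (coefficient 0 dropped)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small floor avoids log(0) for empty bands.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for c in range(1, n_ceps + 1)]

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(ll)
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

def identify(feature_vectors, models):
    """Return the speaker whose GMM gives the highest total log-likelihood."""
    scores = {name: sum(gmm_log_likelihood(x, *params) for x in feature_vectors)
              for name, params in models.items()}
    return max(scores, key=scores.get)
```

In use, each speaker would have one entry in `models` of the form `(weights, means, variances)`, and `identify` would be called with the MFCC vectors of all frames of a test utterance. A practical system would replace the naive DFT with an FFT and estimate the GMM parameters with the EM algorithm.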
Keywords/Search Tags:speaker recognition, feature extraction, MFCC, GMM