Study On Speaker Identification System Based On Gaussian Mixture Model

Posted on: 2009-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Wu
Full Text: PDF
GTID: 2178360242481118
Subject: Signal and Information Processing
Abstract/Summary:
Speaker recognition research began in the 1930s and has been an increasingly active field since the 1960s. It can be applied in many areas, such as security, justice, military affairs, finance and services. For that reason many researchers have worked on it and made considerable progress, but the technology is not yet mature.

Using information extracted from the speech signal to recognize the speaker is one of the important research fields of speech recognition. Depending on the content of the speech, speaker recognition is divided into text-dependent and text-independent recognition. Text-independent speaker identification is especially attractive because it is more flexible and more widely applicable. Speaker recognition is a biometric authentication technology: it uses speech parameters that reflect the speaker's physiological and physical characteristics to identify the speaker. Among biometric technologies, speaker recognition attracts wide attention because of its convenience, efficiency and accuracy.

In text-independent speaker recognition, the GMM turns the speaker recognition problem into the problem of estimating the distribution of the training data. The more complex problems of training and pattern matching are thereby reduced to simpler ones, namely parameter estimation and probability computation. The GMM is also simple, flexible and robust, so it is the state of the art in text-independent speaker recognition.

This thesis studies a speaker recognition system based on the GMM (Gaussian Mixture Model). Before the audio signal is processed, it is analyzed in the time and frequency domains using measures such as short-time power and the short-time zero-crossing rate, and the application of these measures in audio signal processing is introduced in order to gain a deeper understanding of audio signal processing.
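The two time-domain measures named above can be illustrated with a minimal NumPy sketch. The function names, the frame length of 256 samples, the hop of 128 samples and the 8 kHz test tone are assumptions chosen for illustration; the abstract does not state the settings used in the thesis.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_power(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Hypothetical input: one second of a 1 kHz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t + 0.1)

frames = frame_signal(tone, frame_len=256, hop=128)
print(short_time_power(frames)[:3])
print(zero_crossing_rate(frames)[:3])
```

For a pure tone the zero-crossing rate is close to twice the tone frequency divided by the sampling rate, which is why it is often used as a rough indicator of where the spectral energy of a frame lies.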
Next, the MFCC (Mel-frequency cepstral coefficient) feature is extracted, and its advantages for speaker identification and its application are described. Finally, the GMM and the EM algorithm are introduced.

Mixture models are a type of density model comprising a number of component functions, usually Gaussian. These component functions are combined to provide a multimodal density. They can be employed to model the colours of an object in order to perform tasks such as real-time colour-based tracking and segmentation. These tasks may be made more robust by generating a mixture model corresponding to background colours in addition to a foreground model, and employing Bayes' theorem to perform pixel classification. Mixture models are also amenable to effective methods for on-line adaptation of models to cope with slowly varying lighting conditions. Performance tests and comparisons show that the classification algorithm improves the recognition rate.

The work of this thesis covers the following aspects:

(1) Building the complete system: Based on speech segmentation and recognition-rate calculation, the effect of speech-unit length on the recognition rate is studied to verify the correctness and reliability of the system. Tests on the pre-emphasis coefficient and the windowing frame length in the preprocessing stage are carried out to find the best pre-emphasis coefficient and the best frame length for GMMs of different orders.

(2) Study of system performance: Under the same test conditions, the effect of the GMM order on the recognition rate is studied. The negative effects of an order that is too high or too low are analyzed, and the order is chosen according to practical circumstances. Setting a covariance threshold in the EM iterations is proposed.
Experiments with different thresholds are compared, and 0.10 is found to be a generally applicable and practical value for the covariance threshold.

(3) Improvement of the system: Because the conventional EM algorithm suffers from singular covariance matrices, a coefficient α is introduced to control the scaling of the correction, and the effectiveness of the improved algorithm in estimating the coefficients is verified.

Finally, the thesis is summarized and prospects for future work are discussed.
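As a rough illustration of GMM-based speaker identification with a covariance threshold, the sketch below trains one diagonal-covariance GMM per speaker with EM and picks the speaker whose model gives the highest average log-likelihood. Several things here are assumptions and not taken from the thesis: the threshold is interpreted as a lower bound (floor) on each variance, the features are synthetic 2-D points standing in for MFCC vectors, and all function names are hypothetical. The α-controlled correction mentioned in the abstract is not specified there and is not reproduced.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of every row of X under every diagonal Gaussian, shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def log_likelihood(X, weights, means, variances):
    """Per-frame log-likelihood under the mixture, shape (N,)."""
    log_p = np.log(weights) + log_gauss_diag(X, means, variances)
    m = log_p.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return (m + np.log(np.sum(np.exp(log_p - m), axis=1, keepdims=True)))[:, 0]

def fit_gmm(X, n_components, var_floor=0.10, n_iter=50, seed=0):
    """EM for a diagonal GMM; variances are clamped to var_floor (assumed reading
    of the 'covariance threshold') so that no component can become singular."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, n_components, replace=False)].copy()
    variances = np.maximum(np.tile(X.var(axis=0), (n_components, 1)), var_floor)
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        log_p = np.log(weights) + log_gauss_diag(X, means, variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and floored variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, var_floor)
    return weights, means, variances

def identify(X, models):
    """Return the name of the model with the highest mean log-likelihood on X."""
    scores = {name: log_likelihood(X, *params).mean() for name, params in models.items()}
    return max(scores, key=scores.get)

def synthetic_speaker(rng, centers, n_per_center=200, std=0.5):
    """Hypothetical stand-in for a speaker's feature vectors."""
    return np.vstack([c + std * rng.standard_normal((n_per_center, 2)) for c in centers])

rng = np.random.default_rng(1)
centers = {"A": [np.array([0.0, 0.0]), np.array([3.0, 3.0])],
           "B": [np.array([-5.0, 5.0]), np.array([8.0, -3.0])]}
models = {name: fit_gmm(synthetic_speaker(rng, c), n_components=2)
          for name, c in centers.items()}
print(identify(synthetic_speaker(rng, centers["A"], n_per_center=50), models))
```

Clamping the variances after each M-step is one common way to keep EM away from degenerate components; comparing several values of `var_floor` on held-out speech would mirror the threshold comparison described in the abstract.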
Keywords/Search Tags: Speaker Identification, GMM (Gaussian Mixture Model), EM Algorithm