Gaussian Mixture Model-based Speaker Recognition Algorithm

Posted on: 2010-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: K Yan
Full Text: PDF
GTID: 2208360275498887
Subject: Computer application technology
Abstract/Summary:
Speech is the most convenient, fastest and most natural way for people to communicate. Over the past thirty years, advances in science and technology have driven considerable progress in speaker recognition research, which promises to make daily life more convenient. This thesis centers on the construction of a Gaussian Mixture Model (GMM) speaker recognition system. On the basis of performance tests and comparisons, it modifies the initialization module and the recognition statistics module to improve the recognition rate. The main work is as follows:
(1) Building the complete system: A GMM speaker recognition system is implemented using mixed C++ and Matlab programming. It comprises a voice acquisition module, a voice preprocessing module, a feature extraction module, a model parameter training module and a recognition module.
(2) Studying system performance: The effects of different features and GMM parameters on the system are studied. For feature extraction, the candidate features are LPC, LPCC and MFCC; experiments show that the system performs better with MFCC than with LPCC or LPC. The effect of the GMM order on the recognition rate is also studied: the drawbacks of an order that is too high or too low are analyzed, and the order is chosen according to practical conditions. A covariance threshold is set during the EM algorithm iterations; experiments comparing different thresholds find that 0.1 is a generally applicable and practical value.
(3) Improving the system: Because a small number of isolated points can considerably affect clustering results, the k-means algorithm is improved by separating the cluster centers from the cluster seeds, and the effectiveness of the improved algorithm is verified.
In addition, the triangle inequality is used to reduce the running time of the k-means algorithm. On the recognition side, because speakers' features vary and because of noise and other interference, the scores of non-target models may exceed the score of the target model. The thesis therefore proposes a method that weights the scores of individual speech frames during recognition, and verifies the effectiveness of this method.
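The covariance threshold described in item (2) can be illustrated with a minimal sketch. The abstract does not say how the thesis applies the threshold, so the code below assumes the common "variance floor" interpretation: after each M-step, any variance that falls below the threshold is raised to it. The one-dimensional model, the function names (`gaussian_pdf`, `em_step`, `fit_gmm_1d`) and the toy data are all hypothetical and chosen only for illustration. The value 0.1 is the one reported in the abstract, and its suitability depends on how the features are scaled.

```python
import math

VAR_FLOOR = 0.1  # threshold value reported in the abstract (assumed here to act as a variance floor)


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_step(data, weights, means, variances, var_floor=VAR_FLOOR):
    """One EM iteration for a 1-D GMM; each variance is floored at var_floor."""
    k = len(weights)

    # E-step: responsibility of each component for each sample.
    resp = []
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint) or 1e-300  # guard against all densities underflowing to 0
        resp.append([p / total for p in joint])

    # M-step: re-estimate parameters, then apply the variance floor.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        new_w.append(nj / len(data))
        if nj < 1e-12:
            # A starved component keeps its previous mean and variance.
            new_m.append(means[j])
            new_v.append(variances[j])
            continue
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_m.append(mean)
        new_v.append(max(var, var_floor))  # the covariance threshold
    return new_w, new_m, new_v


def fit_gmm_1d(data, weights, means, variances, n_iter=20, var_floor=VAR_FLOOR):
    """Run a fixed number of EM iterations from the given starting parameters."""
    for _ in range(n_iter):
        weights, means, variances = em_step(data, weights, means, variances, var_floor)
    return weights, means, variances


# Toy data: four identical samples would drive one component's variance toward 0.
data = [5.0, 5.0, 5.0, 5.0, -1.0, -0.5, 0.0, 0.5, 1.0]
weights, means, variances = fit_gmm_1d(data, [0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
print(weights, means, variances)
```

In this toy run the component centred on the repeated samples would otherwise collapse toward zero variance, which makes the density evaluation numerically unstable. The floor keeps that component usable, at the cost of a slightly broader Gaussian than the data alone would suggest.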
Keywords/Search Tags:Speaker Recognition, Gauss Mixture Model, Clustering algorithm, Likelihood