
The Research Of Speaker Recognition Method Based On Cepstrum Features

Posted on: 2011-08-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2178330332963230
Subject: Electronics and Communications Engineering
Abstract/Summary:
Automatic Speaker Recognition (ASR) is one of the most important biometric technologies. Because it is low-cost, easy to use, and effective, ASR has potential applications in information services, banks and securities trading institutions, public security and judicial systems, the military, and document security. Speaker recognition research still faces many open problems, however, which makes further study necessary.

To improve the performance of speaker recognition systems, this thesis first introduces speaker recognition theory and the difficulties inherent in the technology. It then focuses on two key aspects: the features and the models used for speaker recognition. For features, the thesis studies the properties of four kinds of cepstral features: the Linear Prediction Cepstrum Coefficient (LPCC), the Mel-Frequency Cepstrum Coefficient (MFCC), the Perceptual Linear Prediction Coefficient (PLPC), and its improved variant, the Relative Spectral Transform Perceptual Linear Prediction Coefficient (RASTA-PLPC). For models, the theories of the Gaussian Mixture Model (GMM) and the Support Vector Machine (SVM) are introduced, and a combined model is proposed based on their properties. The specific work is as follows.

(1) Summary and analysis of the performance of several features. The author analyzes the four static features, their dynamic features, and the combinations of each static feature with its dynamic features, and runs speaker recognition experiments on both clean and noisy speech. Based on theory and experiment, the conclusions are as follows. Among all the features, the combination of PLPC and its dynamic features performs best on clean speech. The combination of RASTA-PLPC and its dynamic features is more robust and performs well on noisy speech.

(2) Research on feature transformation. To improve feature performance, the author applies a principal component analysis (PCA) transformation to the initial features in order to strip out information that does not characterize the individual speaker. Although this attempt was unsuccessful, the idea remains an important approach in feature research.

(3) Research on and improvement of the speaker recognition model. GMM describes data distributions well but needs a large number of samples. SVM is a classifier designed for small sample sets; it classifies well but takes too much time when there are many classes. A combined model is therefore proposed, and the experiments show that it yields some improvement over GMM.
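The combination of static features with their dynamic features mentioned in item (1) can be illustrated with a small sketch. The abstract does not say how the dynamic features were computed, so the code below assumes the commonly used regression (delta) formula with edge frames repeated at the boundaries. The function names `delta` and `add_dynamic` and the `width` parameter are hypothetical and chosen only for this illustration.

```python
def delta(frames, width=2):
    """Regression-based delta features for a list of per-frame vectors.

    Assumed formula (not taken from the thesis):
        d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the nearest edge frame.
    """
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    result = []
    for t in range(num_frames):
        row = []
        for k in range(dim):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        result.append(row)
    return result


def add_dynamic(frames, width=2):
    """Concatenate each static vector with its delta vector."""
    return [static + dyn for static, dyn in zip(frames, delta(frames, width))]


# Toy "cepstral" frames: the first coefficient rises by 1 per frame,
# the second stays constant.
frames = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
combined = add_dynamic(frames)
```

In this toy input the middle frame has a delta of 1.0 for the rising coefficient and 0.0 for the constant one, so each combined vector has twice the dimension of the static vector. In practice the static vectors would be LPCC, MFCC, PLPC, or RASTA-PLPC coefficients rather than hand-written numbers.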
Keywords/Search Tags: Automatic speaker recognition (ASR), Biometric, Gaussian Mixture Model (GMM), Support Vector Machine (SVM)