
Research On Support Vector Machine For Speaker Recognition

Posted on: 2007-08-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Lei
Full Text: PDF
GTID: 1118360212456470
Subject: Computer Science and Technology
Abstract/Summary:
Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information contained in speech signals. As a form of biometrics, it is an important research field within speech signal processing and is studied by a growing number of researchers.

A support vector machine (SVM) is a supervised learning technique from the field of machine learning, applicable to both classification and regression. Rooted in the Statistical Learning Theory developed by Vladimir Vapnik and co-workers at AT&T Bell Laboratories in 1995, the SVM is based on the principle of Structural Risk Minimization and has attracted growing attention in many different fields for its superior performance. The main idea of the SVM can be summarized in two points. First, it uses a nonlinear kernel function to represent an inner product in the feature space. Second, it implements the structural risk minimization principle of statistical learning theory by constructing an optimal hyperplane with maximum margin between the two classes.

In this thesis, we develop techniques for text-independent speaker recognition using support vector machines. The methods can be divided into frame-based and utterance-based approaches, each of which is studied in a different fashion.

In the frame-based method, every frame is scored by the SVM and the decision is made from the score accumulated over the entire utterance, a scheme that is also widely used with generative models. In this method, the inputs to the support vector machine are the frame vectors. Training an SVM relies on a quadratic programming optimizer, so it does not scale easily to large data sets in either time or memory. To construct a small data set for training SVMs, a clustering algorithm is generally adopted to select representative samples.
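The frame-based pipeline described above can be illustrated with a minimal sketch. It is not the thesis's implementation: k-means stands in for the unspecified clustering algorithm, the RBF kernel and its `gamma` are assumptions, and the "SVM" is hand-set (cluster centroids as support vectors with coefficients +1 and -1) rather than trained by a quadratic programming solver. All function names are hypothetical.

```python
import math
import random

def kmeans(frames, k, iters=20, seed=0):
    """Plain Lloyd's k-means; the k centroids serve as representative samples."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda j: math.dist(f, centroids[j]))
            clusters[nearest].append(f)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def svm_decision(frame, support_vectors, dual_coefs, bias, gamma=0.5):
    """f(x) = sum_i c_i * K(s_i, x) + b, where c_i = alpha_i * y_i."""
    return sum(c * rbf_kernel(sv, frame, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

def utterance_score(frames, support_vectors, dual_coefs, bias, gamma=0.5):
    """Accumulate the per-frame SVM outputs and normalize by utterance length."""
    scores = [svm_decision(f, support_vectors, dual_coefs, bias, gamma)
              for f in frames]
    return sum(scores) / len(scores)

# Toy 2-D "frames" for a target speaker and an impostor (made-up data).
target_frames = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2)]
impostor_frames = [(3.0, 3.0), (3.2, 3.0), (3.0, 3.2)]

# Reduce each class to one representative sample, then hand-set the model.
support_vectors = kmeans(target_frames, 1) + kmeans(impostor_frames, 1)
dual_coefs = [1.0, -1.0]

test_utterance = [(0.1, 0.1), (0.0, 0.1)]
accepted = utterance_score(test_utterance, support_vectors, dual_coefs, 0.0) > 0.0
print("accept" if accepted else "reject")
```

Averaging rather than summing the frame scores keeps the decision threshold independent of utterance length; in a real system the support vectors, coefficients, and bias would come from training on the reduced data set.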
We study the effect of the clustering algorithm, the size of the selected data, class weights, the scoring scheme, kernel functions, multiclass classification, probability output, and so on.

In this thesis, ensembles of support vector machines are applied to text-independent speaker recognition, and two models are proposed that adopt the ideas of boosting and mixture-of-experts ensembles. The purpose of adopting these ideas is to handle large-scale speech data and to improve speaker recognition performance. Distance-based and probability-based scoring methods are used to score a new utterance. Unlike the conventional vector-based speaker models (Vector Quantization and Gaussian Mixture Model), our method is hyperplane-based. The experiments were run on the YOHO database, and the results show that our models achieve promising performance.

The other family is the utterance-based approaches, which map an utterance into a vector as...
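The abstract does not give formulas for the distance-based and probability-based scoring of an SVM ensemble, so the following is only one plausible reading, sketched under stated assumptions: linear member classifiers, the signed geometric distance to the hyperplane as the distance score, a Platt-style sigmoid with assumed parameters `A` and `B` as the probability score, and a weighted average over ensemble members. All names are hypothetical.

```python
import math

def decision_value(x, w, b):
    """Raw output f(x) = w . x + b of one linear member classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def distance_score(x, w, b):
    """Signed geometric distance from x to the hyperplane: f(x) / ||w||."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return decision_value(x, w, b) / norm

def probability_score(x, w, b, A=-1.0, B=0.0):
    """Platt-style sigmoid 1 / (1 + exp(A * f(x) + B)).

    A and B are assumed values here; they would normally be fitted
    on held-out data.
    """
    return 1.0 / (1.0 + math.exp(A * decision_value(x, w, b) + B))

def ensemble_utterance_score(frames, members, weights, frame_score):
    """Weighted average of member scores per frame, then mean over frames."""
    total_weight = sum(weights)
    per_frame = [
        sum(wt * frame_score(f, w, b) for wt, (w, b) in zip(weights, members))
        / total_weight
        for f in frames
    ]
    return sum(per_frame) / len(per_frame)

# Two hand-set member hyperplanes (w, b) with equal ensemble weights.
members = [((1.0, 0.0), 0.0), ((0.0, 2.0), 0.0)]
weights = [1.0, 1.0]
utterance = [(1.0, 1.0), (0.5, 2.0)]

print(ensemble_utterance_score(utterance, members, weights, distance_score))
print(ensemble_utterance_score(utterance, members, weights, probability_score))
```

Distance scores are unbounded and depend on each member's scale, while sigmoid outputs lie in (0, 1) and are easier to combine across members; which of the two works better for speaker recognition is an empirical question the sketch does not answer.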
Keywords/Search Tags: Speaker Recognition, Support Vector Machine, Vector Quantization, Gaussian Mixture Model, Universal Background Model