
Phasor Machine Stability Key Speaker Recognition

Posted on: 2009-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: H H Cong
Full Text: PDF
GTID: 2208360245461112
Subject: Computer software and theory
Abstract/Summary:
The goal of speaker recognition is to decide which person in a group of known speakers is talking, using features extracted from the speech. It is a biometric technology and is considered one of the most natural biometric identification methods. Because the voice is an inherent characteristic that is produced naturally, training and testing do not need a special input device, and this low cost has helped the voice become an accepted biometric characteristic. Although current speaker recognition technology has made considerable progress, many practical problems remain to be solved.

In this thesis, speaker recognition based on the Support Vector Machine (SVM) was studied. Because of its strong discriminative ability, SVM performs very well in speaker recognition, and it can achieve a generalisation performance that is better than or equal to that of other classifiers. However, SVM becomes inefficient when the number of training patterns is large. To overcome this limitation of the traditional SVM, speaker recognition based on the Gaussian Mixture Model (GMM) was also studied, and a new approach for speaker recognition that combines SVM and GMM was proposed. The experimental results show that the new method performs better than either of the other two methods.

Recently, Jayadeva and R. Khemchandani proposed a nonparallel plane classifier for binary data classification, which they termed the Twin Support Vector Machine (TWSVM). This algorithm generates two nonparallel planes such that each plane is close to one of the two classes and as far as possible from the other. In this thesis, a new approach that uses TWSVM is proposed for text-independent speaker recognition. It combines a generative model with a discriminative model, but it differs from the usual approaches. First, the proposed method extracts features from the training data using GMM. Then TWSVM models are trained on the features extracted by GMM.
Since the approach reduces the number of features, it is more efficient on large-scale datasets than the traditional SVM. Under the same conditions, it performs better than SVM and is almost as effective as GMM in terms of recognition rate. The recognition rate of GMM drops quickly as the number of training patterns decreases, but the proposed method does not have this disadvantage because it incorporates the discriminative model. In other words, the proposed approach can achieve a stable recognition rate with fewer training patterns. This advantage makes the proposed method more efficient and suitable for situations with small sample sizes. In addition, the proposed approach can also handle large datasets: as the number of patterns in the database increases, the recognition rate of TWSVM decreases only slowly. The experimental results also show the success of the method for speaker recognition. Based on the above methods, I developed a speaker recognition system. The thesis describes the design, implementation, and functions of the system.
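
The two-plane idea behind TWSVM can be illustrated with a small sketch. This is not the implementation from the thesis: it swaps in the least-squares variant of TWSVM, which replaces the quadratic programs with two linear systems, so that the example stays short and needs only the standard library. The function names (`fit_plane`, `train`, `predict`), the penalty `c`, the ridge term `eps`, and the toy 2-D points standing in for GMM-derived feature vectors are all assumptions made for illustration.

```python
import math

def gram(rows):
    """Return M^T M for a matrix given as a list of rows."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for col in range(k, n + 1):
                m[r][col] -= f * m[k][col]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        tail = sum(m[k][col] * x[col] for col in range(k + 1, n))
        x[k] = (m[k][n] - tail) / m[k][k]
    return x

def fit_plane(near, far, c=1.0, eps=1e-6):
    """Fit a plane (w, b) that lies close to `near` and away from `far`.

    Minimises 0.5*||E z||^2 + 0.5*c*||e + F z||^2 over z = (w, b), where
    E = [near, 1] and F = [far, 1]. Setting the gradient to zero gives
    (E^T E + c F^T F) z = -c F^T e; eps adds a small ridge for stability.
    """
    E = [list(x) + [1.0] for x in near]
    F = [list(x) + [1.0] for x in far]
    ge, gf = gram(E), gram(F)
    n = len(ge)
    lhs = [[ge[i][j] + c * gf[i][j] + (eps if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    rhs = [-c * sum(row[i] for row in F) for i in range(n)]
    z = solve(lhs, rhs)
    return z[:-1], z[-1]

def distance(x, plane):
    """Perpendicular distance from point x to the plane w.x + b = 0."""
    w, b = plane
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(value) / math.sqrt(sum(wi * wi for wi in w))

def train(class0, class1, c=1.0):
    """Fit one plane per class; each plane treats the other class as `far`."""
    return fit_plane(class0, class1, c), fit_plane(class1, class0, c)

def predict(x, planes):
    """Assign x to the class whose plane is nearest."""
    return 0 if distance(x, planes[0]) <= distance(x, planes[1]) else 1

# Made-up 2-D points standing in for per-speaker feature vectors.
speaker0 = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
speaker1 = [(0.0, 2.0), (1.0, 3.0), (2.0, 4.0), (3.0, 5.0)]
planes = train(speaker0, speaker1)
print(predict((1.0, 1.2), planes), predict((1.0, 2.9), planes))
```

Because each class gets its own plane, a test vector is labelled by comparing two distances rather than by the side of a single separating hyperplane. For more than two speakers, one plausible extension is to train one such pair per speaker against the rest, though the abstract does not say which multi-class scheme the thesis uses.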
Keywords/Search Tags: Speaker Recognition, Support Vector Machine, Twin Support Vector Machine, Gaussian Mixture Model