
The Research Of Robust Text-independent Open-set Speaker Recognition

Posted on: 2007-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Zhao
GTID: 2178360185466943
Subject: Computer software and theory
Abstract/Summary:
Language is the most important tool for human communication. The speech signal, as the carrier of language, conveys information at several levels. Using the speaker-specific information it contains, a system can either identify who is speaking or verify whether the speaker is the person they claim to be. Automatic speaker recognition now performs very well in laboratory conditions, but the mismatch between training data and test data caused by the many kinds of noise in real environments makes the recognition rate decline dramatically. Improving performance at low signal-to-noise ratio (SNR) is therefore the key to moving such systems from the laboratory into practice.

Speaker recognition technology consists of feature extraction and pattern classification. This thesis studies the organs of speech production and of hearing in order to understand the robustness of speech, and examines several basic classifiers in depth. All of this work is extended to text-independent, open-set speaker recognition in noisy environments.

Information entropy, which is widely used in coding theory, measures the average uncertainty of an information source, so the entropy of speech and the entropy of noise should differ. This thesis applies an entropy function to speech segmentation. The experimental results show that spectral entropy performs well at low SNR and under unknown noise conditions. In addition, a dynamic threshold is proposed for carrying out the phonetic segmentation.

Because the spectrum of noise rarely covers the whole spectrum of speech, this thesis uses multi-subband feature extraction and computes sub-cepstrum features based on Teager energy in every subband. Furthermore, a hybrid system combining a Support Vector Machine (SVM) and a Gaussian Mixture Model (GMM) is introduced. First, the system applies an SVM to every subband, so that speakers who do not belong to the training set are filtered out. Then the feature vectors of speakers in the training set are weighted by the score determined by the SVM, so that the subband features with more influence on recognition are emphasized. Finally, the weighted features are combined and passed to the GMM for the final decision. The experimental results show that this system still performs well at low SNR.
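
As a rough illustration of two ideas named in the abstract, the following Python sketch computes per-frame spectral entropy, marks low-entropy frames as speech using a threshold derived from the utterance itself, and evaluates the discrete Teager energy operator. It is not the thesis's implementation: the abstract does not specify the dynamic threshold, so the rule used here (a fraction `alpha` of the observed entropy range), the frame sizes, and all function names are assumptions made for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def spectral_entropy(frames, eps=1e-12):
    """Shannon entropy of each frame's normalized power spectrum."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    prob = power / (power.sum(axis=1, keepdims=True) + eps)
    return -(prob * np.log(prob + eps)).sum(axis=1)

def segment_speech(x, frame_len=256, hop=128, alpha=0.5):
    """Mark frames whose spectral entropy falls below a per-utterance threshold.

    Assumption: the threshold is placed a fraction `alpha` of the way from
    the lowest to the highest entropy observed in this utterance. The
    thesis's actual dynamic threshold is not described in the abstract.
    """
    entropy = spectral_entropy(frame_signal(x, frame_len, hop))
    threshold = entropy.min() + alpha * (entropy.max() - entropy.min())
    return entropy < threshold, entropy, threshold

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# Toy usage: white noise with a tone burst standing in for voiced speech.
rng = np.random.default_rng(0)
signal = 0.1 * rng.standard_normal(8000)
n = np.arange(3000, 5000)
signal[3000:5000] += np.sin(2 * np.pi * 440 * n / 8000)
is_speech, entropy, threshold = segment_speech(signal)
```

The sketch relies on structured signals such as voiced speech concentrating their power in a few frequency bins, which gives lower spectral entropy than broadband noise. For a pure sinusoid with amplitude A and digital frequency Ω, the Teager operator returns the constant A² sin²(Ω), which is why it is often used as an energy measure that tracks both amplitude and frequency.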
Keywords/Search Tags:Speaker Recognition, Mel Scale, Sub-cepstrum, SVM, GMM