Study On Speaker Recognition In Short Utterance Condition

Posted on: 2013-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhu
Full Text: PDF
GTID: 2248330362462713
Subject: Communication and Information System
Abstract/Summary:
Speaker recognition is one of the most active topics in biometric pattern recognition research. Through the sustained efforts of many researchers, a relatively complete theoretical framework has formed, but practical applications still face many problems: systems are not robust, there is often not enough speech data, and recognition rates still need to improve.

First, we describe the preprocessing of the speech signal, focusing on its key step, endpoint detection. Because the double-threshold detection algorithm performs poorly in noisy environments, we use MFCC0 as the endpoint detection feature, which does not significantly increase the system's computation. This feature is normally discarded when the MFCC parameters are calculated, yet experiments show that the algorithm detects endpoints well.

Secondly, we introduce the traditional features of the speech signal. The MFCC feature sacrifices frequency resolution in the high-frequency part in order to obtain higher frequency resolution in the low-frequency part. The wavelet packet transform can further decompose the high-frequency part to increase its frequency resolution, and the Teager energy operator (TEO) can reduce the effect of noise. We therefore divide the speech signal into a low-frequency part and a high-frequency part. For the low-frequency part, we extract parameters with the MFCC method. For the high-frequency part, we first decompose it into many small sub-bands and use the TEO to extract the sub-band energies. We then combine the two into a hybrid parameter. Compared with the MFCC parameter alone, the hybrid parameter raises the recognition rate.

Thirdly, in speaker modeling, the SVM outputs the candidate with the largest number of votes, yet the true speaker almost always ranks near the top of the voting without always ranking first. We therefore output a list of the most likely speakers instead and use a GMM to verify the speaker among them, which yields an SVM/GMM model. Experiments show that the algorithm is effective.

Finally, to address the drop in recognition rate when the speech data is very short, we put forward three compensation measures, in the sample domain, the feature domain and the model domain respectively: adding noise at a high signal-to-noise ratio to the clean speech to expand the number of speech samples, using combinations of features instead of a single feature to compensate for the lack of speech data, and using an integrated model instead of a single model. Compared with an SVM model using a single feature, the experimental results indicate that the improved method effectively raises the recognition rate.
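
Two of the signal-level operations named in the abstract, the Teager energy operator and noise addition at a chosen signal-to-noise ratio, can be sketched as follows. This is only an illustrative sketch, not the thesis implementation: the function names `teager_energy` and `add_noise_at_snr` are hypothetical, the noise is assumed to be white Gaussian (the abstract does not state the noise type), the input is assumed to be non-empty and non-silent, and the wavelet packet sub-band decomposition that would precede the TEO step is omitted.

```python
import math
import random


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Defined for interior samples only, so the result is two samples
    shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def add_noise_at_snr(clean, snr_db, seed=None):
    """Return a noisy copy of `clean` at the requested SNR in dB.

    Assumption: white Gaussian noise. The noise is rescaled so that
    signal_power / noise_power equals 10 ** (snr_db / 10).
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum(v * v for v in noise) / len(noise)
    scale = math.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * v for s, v in zip(clean, noise)]


if __name__ == "__main__":
    # A pure tone A*cos(w*n) has constant Teager energy A^2 * sin(w)^2.
    tone = [2.0 * math.cos(0.3 * n) for n in range(200)]
    print(teager_energy(tone)[:3])

    # Several noisy copies of one clean utterance enlarge the sample set.
    augmented = [add_noise_at_snr(tone, 20.0, seed=k) for k in range(3)]
    print(len(augmented), len(augmented[0]))
```

Calling `add_noise_at_snr` with different seeds produces distinct noisy copies of the same utterance, which is one way to read the sample-domain compensation described above. In a fuller pipeline, `teager_energy` would be applied to each high-frequency sub-band before the energies are averaged per frame.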
Keywords/Search Tags: speaker recognition, hybrid feature, SVM/GMM model, short utterance, noise addition, feature combination