
Speaker Recognition In Noisy Environments

Posted on: 2008-02-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Q Qiu
GTID: 1118360215497782
Subject: Circuits and Systems
Abstract/Summary:
Speaker recognition has been studied for several decades, and some of its techniques have matured. Text-dependent speaker recognition in particular is already in commercial use. Text-independent speaker recognition is more difficult because the spoken content is not known in advance; that same property makes it attractive, and it has become an active research topic. In addition, speaker recognition achieves satisfactory results on clean speech, but the recognition rate drops in noisy environments, which is a matter of system robustness. This thesis therefore focuses on speaker recognition in noisy environments. The main achievements are as follows:

(1) Two modifications are made to a speaker recognition system operating in noise. First, to improve robustness, the noisy speech is decomposed into several frequency bands, and de-noising is carried out in each band with the Teager energy operator (TEO). The wavelet coefficients are then weighted according to the characteristics of speaker recognition and finally transformed into Mel-frequency cepstral coefficients (MFCC). Second, to improve recognition performance and training speed, a modified orthogonal Gaussian mixture model (OGMM) is used at the recognition stage, in which the orthogonal transform is performed before the expectation-maximization (EM) algorithm is applied. The orthogonal operation therefore does not have to be repeated in every EM iteration. Experimental results show that the proposed feature parameters give better results, and that applying the modified OGMM further improves recognition performance and training speed.

(2) The Karhunen-Loève transform (KLT) has been applied to feature extraction for text-independent speaker identification, but its eigen-decomposition carries a heavy computational load. To reduce this load, the KLT is combined with overlapping sub-frames for speaker identification in additive noise. Based on this separation approach, an efficient technique is proposed that builds the feature matrix while preserving the effectiveness of the KLT. Experiments show that the overlapping sub-frame KLT requires 12kp^3 + 11pM^3 + M^2k multiplications, fewer than the 12k^3 multiplications of the traditional KLT. In the traditional minimum classification error (MCE) method, a system with K speakers must compute K-1 decision functions for each classification error, so the number of multiplications grows sharply as K increases. A modified MCE (MMCE) model is proposed to reduce the number of multiplications and speed up computation. In several comparative experiments, KLT/MMCE clearly improves the identification rate over the Gaussian mixture model (GMM); in particular, when the number of mixtures reaches 128, the identification rate reaches 98.5%. The experimental results thus show that the proposed method does reduce the number of multiplications and improve the identification rate of the system.

(3) A novel wavelet de-noising method is adopted for front-end processing of noisy speech. In view of the characteristics of speaker recognition, the wavelet coefficients are weighted before wavelet reconstruction, and a GMM algorithm is used at the recognition stage. Experiments show that the proposed method performs better in noisy conditions than speaker recognition with plain MFCC features. The method offers useful guidance for real-time speaker recognition.

(4) A weighted combination of Gaussian functions can describe a wide range of sample distributions, so the GMM is computationally efficient and easy to implement, especially under real-time conditions. Under the maximum-likelihood (ML) criterion, the model parameters are updated iteratively until the likelihood of the observation sequence reaches a limit point. In practice, however, because of this hill-climbing behavior, an arbitrary initial parameter estimate usually leads to a local optimum. The genetic algorithm (GA) is a powerful global search tool developed in recent years, well suited to complex combinatorial optimization and nonlinear function optimization. This thesis proposes a novel GMM/GA algorithm for speaker recognition that addresses the local-optimum problem of the GMM. Experiments show that the GMM/GA algorithm achieves better results than the plain GMM algorithm.

(5) Two further modifications for speaker recognition are presented. The goal of de-noising is to remove the noise while retaining as many of the important features as possible. Signal de-noising by non-linear processing, for example with the wavelet transform, has recently become increasingly popular. First, for thresholding in the wavelet domain, a semi-soft threshold function is used, which has advantages over the hard and soft threshold functions with respect to the variance and bias of the estimate. GMMs require at least several minutes of training speech, which is inconvenient for real-world applications. Classifiers based on artificial neural networks (ANN), by contrast, perform better on telephone speech and need less training data than GMM-based ones. Second, a probabilistic neural network (PNN) is combined with the GMM to improve system performance. Experiments show that the proposed method is advantageous for speaker recognition in noisy conditions.
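
As a rough illustration of the operator named in item (1), the sketch below computes the discrete Teager energy of a sequence, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The abstract does not say how the TEO output is turned into a de-noising rule in each frequency band, so only the operator itself is shown; the function name `teager_energy` and the test signal are illustrative assumptions, not taken from the thesis.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator (TEO).

    Returns psi[n] = x[n]^2 - x[n-1] * x[n+1] for the interior samples
    n = 1 .. len(x) - 2, so the output is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test signal: for a pure tone A*cos(w*n) the operator gives
# the constant A^2 * sin(w)^2, which makes the result easy to check.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
```

In a sub-band scheme such as the one described, the operator would presumably be applied to the coefficients of each band separately; that wiring is not specified in the abstract and is left out here.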
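
The decision-function cost described in item (2) can be made concrete with a sketch of a conventional MCE misclassification measure. It assumes the widely used form d_k = -g_k + (1/eta) * log[(1/(K-1)) * sum over j != k of exp(eta * g_j)], followed by a sigmoid loss; the abstract does not give the thesis's exact formulation or its modified MCE model, so the names `mce_misclassification`, `mce_loss`, `eta` and `gamma` are assumptions made for illustration.

```python
import math


def mce_misclassification(scores, k, eta=1.0):
    """Conventional MCE misclassification measure for true class k.

    scores[j] is the discriminant (for example a log-likelihood) of speaker j.
    All K-1 competing scores enter the measure, which is why the cost of the
    traditional method grows with the number of speakers K.
    """
    competitors = [g for j, g in enumerate(scores) if j != k]
    if not competitors:
        raise ValueError("need at least two classes")
    # Log-sum-exp with the maximum factored out for numerical stability.
    m = max(eta * g for g in competitors)
    lse = m + math.log(sum(math.exp(eta * g - m) for g in competitors))
    return -scores[k] + (lse - math.log(len(competitors))) / eta


def mce_loss(d, gamma=1.0):
    """Smooth 0-1 loss: sigmoid of the misclassification measure."""
    return 1.0 / (1.0 + math.exp(-gamma * d))


# Hypothetical scores for K = 3 speakers; speaker 0 has the highest score.
scores = [-1.0, -5.0, -5.0]
d_correct = mce_misclassification(scores, k=0)  # negative: correctly classified
d_wrong = mce_misclassification(scores, k=1)    # positive: misclassified
```

A negative measure corresponds to a correct decision and a positive one to an error, and the sigmoid turns that into a differentiable loss that can be minimized.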
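
The three threshold rules compared in item (5) can be sketched as follows. The abstract does not state the exact semi-soft function used, so the two-threshold form commonly called semi-soft or firm shrinkage is assumed here; the function names and the thresholds `t`, `t1`, `t2` are illustrative.

```python
def hard_threshold(w, t):
    """Keep coefficients whose magnitude exceeds t, zero the rest."""
    return float(w) if abs(w) > t else 0.0


def soft_threshold(w, t):
    """Zero small coefficients and shrink the remaining ones toward zero by t."""
    if abs(w) <= t:
        return 0.0
    return w - t if w > 0 else w + t


def semisoft_threshold(w, t1, t2):
    """Assumed semi-soft (firm) rule with lower threshold t1 and upper t2.

    |w| <= t1      -> 0                       (as in hard and soft thresholding)
    t1 < |w| <= t2 -> linear ramp from 0 to t2 (continuous, unlike hard)
    |w| > t2       -> w unchanged             (no bias on large coefficients,
                                               unlike soft)
    """
    if not 0 <= t1 < t2:
        raise ValueError("require 0 <= t1 < t2")
    a = abs(w)
    if a <= t1:
        return 0.0
    if a > t2:
        return float(w)
    sign = 1.0 if w > 0 else -1.0
    return sign * t2 * (a - t1) / (t2 - t1)


# Hypothetical wavelet coefficients, thresholded with t1 = 1 and t2 = 2.
coeffs = [0.5, -1.5, 1.5, 2.0, 3.0]
shrunk = [semisoft_threshold(c, 1.0, 2.0) for c in coeffs]
```

The ramp between the two thresholds is what lets such a rule trade off the discontinuity of hard thresholding (higher variance) against the constant shrinkage of soft thresholding (higher bias).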
Keywords/Search Tags: Speaker recognition, Wavelet de-noising, Self-adapted wavelet, DWT-TEO, GMM/GA algorithm, DWT/KLT, PNNGMM algorithm