
Research On Some Key Issues Of Speaker Recognition In Noisy Environment

Posted on: 2014-09-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D An
Full Text: PDF
GTID: 1318330482454618
Subject: Detecting technology and automation devices
Abstract/Summary:
Speaker recognition, which relies on biometric characteristics of the voice, is one of the research focuses in the speech processing field. Some of its results have already gone into production and have been applied successfully in e-commerce, remote customer-service authentication, military security, and other areas. However, most speaker recognition techniques assume a clean speech environment; when the speech is corrupted by external noise, recognition accuracy declines sharply. This dissertation reviews the work of other researchers and investigates speech decomposition, speech denoising, feature extraction, and model optimization for speaker recognition against a noisy background.

For speech decomposition, the mode mixing that arises in empirical mode decomposition (EMD) is analyzed theoretically, and the basic conditions under which EMD is free of mode mixing are derived. An improved EMD decomposition method, IEEMD, is proposed: based on those conditions, the definition of the white-noise amplitude and the number of iterations in the original algorithm are amended. A total of 1600 randomly selected utterances from the TIMIT speech database were decomposed with both EEMD and IEEMD, and the results show that IEEMD takes 7 s to 9 s whereas EEMD takes 50 s to 63 s to reach the same decomposition quality.

For speech denoising, the TFast ICA method is proposed, which replaces the second-order Newton iteration with a third-order Newton iteration to resolve the occasional failure of ICA to converge in speech processing. A mathematical proof of convergence is also given. Analysis of real speech signals shows that TFast ICA meets the convergence requirements and that misconvergence caused by random selection of the initial separation vector w does not occur. Finally, the IEEMD-TFast ICA method is proposed, combining the advantages of IEEMD and TFast ICA.
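To illustrate what moving from a second-order to a third-order Newton-type iteration means, the following minimal sketch compares the classical Newton step with a well-known third-order variant (the arithmetic-mean Newton step) on a scalar equation. It is not the TFast ICA update itself, which the abstract does not spell out; the test function, starting point, tolerance, and all function names are assumptions chosen only for illustration.

```python
def f(x):
    """Hypothetical test equation f(x) = 0 (not taken from the dissertation)."""
    return x**3 - 2.0 * x - 5.0


def df(x):
    """Derivative of the test equation."""
    return 3.0 * x**2 - 2.0


def newton_step(func, dfunc, x):
    """Classical Newton step: second-order convergence near a simple root."""
    return x - func(x) / dfunc(x)


def third_order_step(func, dfunc, x):
    """Arithmetic-mean Newton step: third-order convergence near a simple root.

    The derivative is averaged between x and the Newton predictor y.
    This is an assumed stand-in for a 'third-order Newton iteration'.
    """
    y = x - func(x) / dfunc(x)
    return x - 2.0 * func(x) / (dfunc(x) + dfunc(y))


def iterate(step, func, dfunc, x0, tol=1e-12, max_iter=50):
    """Apply `step` until successive iterates differ by less than `tol`.

    Returns the final iterate and the number of steps taken.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_new = step(func, dfunc, x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


root2, n2 = iterate(newton_step, f, df, 2.0)
root3, n3 = iterate(third_order_step, f, df, 2.0)
print(f"second-order: root={root2:.12f} after {n2} steps")
print(f"third-order:  root={root3:.12f} after {n3} steps")
```

Each third-order step costs one extra derivative evaluation, so the benefit shows up as fewer steps rather than less work per step.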
Comparative tests of IEEMD-TFast ICA, EEMD-ICA, and wavelet-transform speech denoising were carried out on the TIMIT speech database and the NOISEX-92 noise database, and the results show that IEEMD-TFast ICA performs much better than the other two methods.

For the extraction of speaker-specific features, an improved CFCC, called ICFCC, is proposed, which incorporates the asymmetry and intensity dependence of the basilar membrane into the extraction of the characteristic parameters. Comparative tests of ICFCC, MFCC-SDC, and CFCC were carried out with a GMM-MMI speaker recognition model on the TIMIT speech database. In babble noise at an SNR of -10 dB, recognition accuracy still exceeds 70% with ICFCC, whereas it reaches only 55% with CFCC and 9% with MFCC-SDC, which shows that ICFCC is the more robust feature.

This dissertation also presents an improved PSOA algorithm for optimizing the parameters of the speaker recognition model. First, a new inertia-weight strategy extends the global search time of PSOA by keeping the weight large in early iterations and extends the local search time by keeping it small in late iterations. Second, a new position-update formula for the particle swarm is introduced that includes a momentum factor from the inertia-weight adjustment strategy, so that the population is less likely to fall into a local optimum and the accuracy of the optimization results improves considerably. Mathematical analysis shows that the proposed PSOA is correct and superior. Five test functions were used to compare five improved PSO algorithms, and the results show that the proposed PSOA outperforms the other four in both robustness and accuracy. When PSOA was used to optimize the SVM parameters, the training time was as short as 324.7812 s, while the average recognition accuracy reached 83.28%.
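The idea of an inertia weight that stays large early (global search) and small late (local search) can be sketched with a plain particle swarm optimizer. The sketch below uses the standard linearly decreasing inertia weight as a stand-in; the dissertation's own weight schedule and its momentum-based position update are not given in the abstract and are not reproduced here. The sphere objective, the bounds, the coefficients `c1` and `c2`, and all function names are assumptions for illustration.

```python
import random


def inertia(t, n_iter, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight: large early, small late (assumed schedule)."""
    if n_iter <= 1:
        return w_min
    return w_max - (w_max - w_min) * t / (n_iter - 1)


def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, n_particles=20, n_iter=200, bound=5.0,
        c1=1.5, c2=1.5, seed=0):
    """Basic PSO; returns best position, best value, and the initial best value."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    initial_best = gbest_val

    for t in range(n_iter):
        w = inertia(t, n_iter)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp velocity and position to keep the swarm inside the box.
                vel[i][d] = max(-bound, min(bound, v))
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, initial_best


best_pos, best_val, initial_best = pso(sphere)
print("best position:", best_pos)
print("best value:", best_val)
```

In a model-tuning setting such as the SVM case mentioned above, `objective` would be replaced by a function that maps candidate hyperparameters to a validation error.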
These results indicate that the SVM optimized by PSOA offers excellent robustness, noise immunity, and classification performance.

Finally, the IEEMD-TFast ICA speech denoising method was designed as an SOPC (system on a programmable chip) and built on an Altera DE2-115 development board. The SOPC architecture is given, and IEEMD-TFast ICA is realized through hardware/software co-design: the TFast ICA IP core was designed and implemented in Verilog HDL, IEEMD was then built on the Nios II processor, and experiments were carried out both in simulation and on real speech, showing that the IEEMD-TFast ICA speech denoising method is effective and efficient.
Keywords/Search Tags: Speaker Recognition, Ensemble Empirical Mode Decomposition, Independent Component Analysis, Cochlear Filter Cepstral Coefficients, Particle Swarm