
Research On Speaker Recognition In Complex Environment Based On Signal Sparse Decomposition

Posted on: 2016-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: T T Wu
GTID: 2208330461982912
Subject: Optical engineering
Abstract/Summary:
Speaker recognition is the process of automatically determining the identity of a speaker from the biometric information carried in his or her speech signal. It stands out among biometric authentication technologies because it is economical, contact-free, universal, and distinctive. Existing speaker recognition technology performs well in noise-free environments. In noisy environments, however, the speech features used for training and those used for recognition become mismatched, which severely degrades recognition performance. Improving the robustness of the recognition system is therefore the key problem for this technology.

As an important branch of signal processing theory, sparse decomposition is widely used in de-noising, compression, parameter estimation, time-frequency analysis, blind source separation, and many other areas. Based on sparse decomposition theory, we study solutions to the speech de-noising problem under different noise backgrounds and apply these methods as a pre-processing stage of a speaker recognition system in order to improve its performance. The main contributions of this thesis are as follows:

1. Construction of a speaker recognition system based on vector quantization. The system structure is introduced first. Each part of the implementation is then described, with special attention to the key steps of feature extraction, template training, and recognition. Finally, the system parameters that achieve the best recognition rate are tuned through simulations.

2. Study of sparse decomposition theory in speech de-noising applications. Sparse representation and signal reconstruction are introduced, with emphasis on basis pursuit algorithms among the various convex relaxation reconstruction methods. The DCT is selected as the sparse representation basis because speech signals are sparse in this basis. Sparse decomposition over the DCT basis is then applied to speech de-noising. Simulations demonstrate that the basis pursuit algorithms effectively increase the AFSNR of the reconstructed signals.

3. Analysis of speaker recognition performance with sparse decomposition de-noising under different noise backgrounds. The de-noising method used in the pre-processing stage of the speaker recognition system depends on whether the noise variance is bounded. For broadband noise with bounded variance, such as uniform noise and Gaussian noise, de-noising is based on sparse decomposition over the DCT basis. For impulse noise with unbounded variance, such as symmetric α-stable noise, de-noising is based on sparse decomposition over a union time-frequency dictionary. Simulations demonstrate that these two methods substantially improve the recognition rate of the speaker recognition system at low SNR and low generalized SNR, respectively.
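
As an illustration of the DCT-based de-noising idea in item 2, the following is a minimal sketch and not the thesis's implementation. It relies on a standard fact: when the dictionary is an orthonormal basis, the Lagrangian form of basis pursuit de-noising, min ½‖y − Dx‖² + λ‖x‖₁, is solved exactly by soft-thresholding the analysis coefficients. The frame length, noise level, threshold `lam`, the toy test signal, and all function names are assumptions chosen for the example; the abstract does not state the solver or parameters actually used.

```python
import math
import random


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix; row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                     for t in range(n)])
    return rows


def soft_threshold(c, lam):
    """Shrink a coefficient toward zero by lam (proximal operator of the l1 norm)."""
    return math.copysign(max(abs(c) - lam, 0.0), c)


def denoise_frame(frame, lam):
    """De-noise one frame by soft-thresholding its DCT coefficients."""
    n = len(frame)
    basis = dct_matrix(n)
    # Analysis: project the frame onto every DCT basis vector.
    coeffs = [sum(b * x for b, x in zip(row, frame)) for row in basis]
    shrunk = [soft_threshold(c, lam) for c in coeffs]
    # Synthesis: rebuild the frame from the shrunken coefficients.
    return [sum(basis[k][t] * shrunk[k] for k in range(n)) for t in range(n)]


def snr_db(clean, estimate):
    """Signal-to-noise ratio of an estimate against the clean reference, in dB."""
    signal = sum(x * x for x in clean)
    error = sum((x - y) ** 2 for x, y in zip(clean, estimate))
    return 10.0 * math.log10(signal / error)


def demo(seed=0, n=64, sigma=0.3, lam=0.6):
    """Toy experiment (assumed setup): a DCT-sparse frame plus Gaussian noise."""
    rng = random.Random(seed)
    basis = dct_matrix(n)
    # Hypothetical clean frame built from two DCT basis vectors only.
    clean = [3.0 * basis[3][t] + 2.0 * basis[10][t] for t in range(n)]
    noisy = [x + rng.gauss(0.0, sigma) for x in clean]
    denoised = denoise_frame(noisy, lam)
    return snr_db(clean, noisy), snr_db(clean, denoised)


if __name__ == "__main__":
    snr_in, snr_out = demo()
    print(f"input SNR: {snr_in:.2f} dB, output SNR: {snr_out:.2f} dB")
```

Real speech is only approximately sparse in the DCT domain, so in practice the threshold would have to be tuned per frame, and an overcomplete dictionary such as the union time-frequency dictionary of item 3 would require an iterative solver rather than this closed-form shrinkage.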
Keywords/Search Tags: sparse decomposition, speaker recognition, speech de-noising, discrete cosine basis, union time-frequency dictionary