Robust Speech Recognition Based On The Combination Of The Denoising Method

Posted on: 2013-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: J M Zhao
Full Text: PDF
GTID: 2248330371496165
Subject: Communication and Information System
Abstract/Summary:
Speech recognition is a technology that converts an input speech signal into the corresponding text or command, so that machines can understand human language. Current speech recognition systems achieve very good results in a quiet laboratory environment. In practical applications, however, background noise causes the training and recognition environments to mismatch, and this mismatch severely reduces recognition performance. Robust speech recognition aims to reduce the negative effect of this mismatch on the recognition system. At present, techniques for improving the robustness of speech recognition apply anti-noise processing in three spaces: the signal space, the feature space and the recognition model space. Targeting the influence of additive noise, this thesis presents a robust recognition technique based on a combination of de-noising methods.

First, in the signal space, the thesis addresses wavelet threshold de-noising and gives an improved threshold function. It is intended to solve two problems: the oscillation introduced into the signal by hard-threshold functions and the excessive distortion introduced by soft-threshold functions. The improved threshold function has good continuity and no constant deviation. Speech signals and noise have different singularity properties across wavelet scales: as the wavelet scale increases, the modulus of the wavelet transform coefficients of speech increases, while that of noise decreases. Based on this fact, an adaptive threshold is given.
Experiments show that the improved method increases the SNR of the processed signal and also improves the robustness of the recognition system.

Then, in the feature space, the traditional MFCC is a speech feature that conforms to the characteristics of human hearing, and the DCT is used in extracting it. However, the DCT uses a fixed window width in the time and frequency domains of the speech signal, so it does not match the characteristics of speech well. The wavelet transform has a multi-resolution property and conforms better to the characteristics of human hearing. This thesis therefore combines the wavelet transform with the MFCC to give an improved speech recognition feature. Experiments show that, compared with the original MFCC, the improved feature parameter raises the recognition rate of the system and is more robust.

Finally, because improving robustness in a single space has only a limited effect, the thesis combines the signal-space and feature-space anti-noise techniques and introduces the wavelet transform, giving a combined de-noising method based on wavelet analysis that improves the robustness of the speech recognition system in noisy environments. In the experiments, the thesis first builds a small-vocabulary, isolated-word speech recognition system based on the combined de-noising method. It then recognizes spoken Chinese digits 0-9 at different SNRs, with white Gaussian noise as the noise source. The results show that the proposed method is effective.
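
The trade-off between hard and soft thresholding described in the signal-space part of the abstract can be illustrated with a small sketch. The abstract does not state the thesis's improved threshold function, so the third function below is only a stand-in: the well-known non-negative garrote rule, which happens to share the two stated properties (it is continuous at the threshold, and its deviation from the original coefficient shrinks toward zero as the coefficient grows). The function names and the threshold value `t` are assumptions made for illustration.

```python
def hard_threshold(w, t):
    """Keep a coefficient unchanged if |w| > t, otherwise zero it.

    Discontinuous at |w| = t, which is the source of oscillation artifacts.
    """
    return w if abs(w) > t else 0.0


def soft_threshold(w, t):
    """Zero small coefficients and shrink the rest toward zero by t.

    Continuous, but every surviving coefficient is off by a constant t.
    """
    if abs(w) <= t:
        return 0.0
    return w - t if w > 0 else w + t


def garrote_threshold(w, t):
    """Non-negative garrote rule (a stand-in, NOT the thesis's function).

    Continuous at |w| = t, and the deviation t*t/|w| vanishes for large |w|.
    """
    if abs(w) <= t:
        return 0.0
    return w - (t * t) / w


if __name__ == "__main__":
    t = 1.0  # hypothetical threshold, chosen only for the demonstration
    for w in (0.5, 1.001, 2.0, 10.0):
        print(w, hard_threshold(w, t), soft_threshold(w, t),
              garrote_threshold(w, t))
```

In a wavelet de-noising pipeline, a rule like these would be applied to the detail coefficients at each scale before reconstruction, with `t` chosen per scale when an adaptive threshold is used.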
Keywords/Search Tags: speech recognition, robustness, wavelet threshold de-noising, characteristic parameter extraction, combination de-noising