Research On Preprocessing Of Robust Speech Recognition

Posted on: 2009-05-25    Degree: Master    Type: Thesis
Country: China    Candidate: Y Wang    Full Text: PDF
GTID: 2178360272456589    Subject: Control theory and control engineering
Abstract/Summary:
Robust speech recognition extracts the essential features of a speech signal in order to recognize and confirm noisy speech. This thesis studies the preprocessing stage of robust speech recognition, whose purpose is to eliminate noise interference and extract "clean" signal parameters. The work consists of three main parts: endpoint detection, speech enhancement and feature extraction.

First, endpoint detection is studied. Its purpose is to pick out the "meaningful" speech segments and avoid interference from noise in the silent segments. Several classic endpoint detection methods are discussed: short-time energy, average zero-crossing rate, double-threshold detection, spectral entropy, power spectral entropy and spectrum variance. The experimental results show the characteristics of each method. By analyzing the shortcomings of the spectrum variance method, a modified endpoint detection method is proposed, namely the sub-band spectrum variance method, and the experimental results demonstrate its superiority.

Second, speech enhancement is studied. Its purpose is to improve the SNR and intelligibility of speech, and it is a key step in making a speech recognition system robust. This part studies several de-noising methods, such as wavelet soft-threshold de-noising and wavelet hard-threshold de-noising. The Bionic Wavelet Transform, which takes auditory perception into account, is studied in depth. The idea of threshold de-noising is then applied to the Bionic Wavelet Transform, yielding a new speech enhancement method based on the Bionic Wavelet Transform.
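The difference between the soft-threshold and hard-threshold rules mentioned above can be illustrated with a minimal sketch. It shows only the thresholding step applied to a list of coefficients; the wavelet (or bionic wavelet) decomposition, the choice of threshold value and the reconstruction are not shown, and the function names and sample values are assumptions for illustration rather than anything taken from the thesis.

```python
import math

def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t unchanged; zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Zero coefficients with magnitude <= t and shrink the others toward zero by t."""
    return [math.copysign(abs(c) - t, c) if abs(c) > t else 0.0 for c in coeffs]

# Hypothetical coefficients standing in for one level of a wavelet decomposition.
coeffs = [3.0, -0.5, 1.5, -4.0]
print(hard_threshold(coeffs, 1.0))  # [3.0, 0.0, 1.5, -4.0]
print(soft_threshold(coeffs, 1.0))  # [2.0, 0.0, 0.5, -3.0]
```

Hard thresholding leaves the surviving coefficients untouched, while soft thresholding also reduces their magnitude by the threshold, which avoids a jump at the threshold at the cost of attenuating large coefficients.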
The results indicate that, in four kinds of realistic noise environments, the proposed method achieves better enhancement performance than several classic methods, including spectral subtraction, Wiener filtering and threshold de-noising based on the Discrete Wavelet Transform.

Finally, feature extraction is studied. Its purpose is to remove the redundant parts of speech and extract the essential features needed for recognition. This part studies several common speech feature parameters, such as LPC, LPCC and MFCC, and compares MFCC with LPCC. Experiments on a purpose-built isolated-word recognition platform show that MFCC represents speech better than LPCC as a feature parameter.
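MFCC features are built on the mel frequency scale, which spaces analysis bands more densely at low frequencies. As a minimal sketch of that one ingredient, the code below uses a commonly used mel formula, mel = 2595 * log10(1 + f / 700), and computes band edges equally spaced in mel. Whether the thesis uses this exact variant, and the frequency range and filter count chosen here, are assumptions; the remaining MFCC steps (framing, spectrum, filterbank energies, logarithm, DCT) are not shown.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel using a common formula (assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(f_low, f_high, n_filters):
    """Return n_filters + 2 edge frequencies in Hz, equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_filters + 2)]

# Illustrative values only: 10 filters covering 0 to 4000 Hz.
edges = mel_band_edges(0.0, 4000.0, 10)
print([round(e) for e in edges])
```

The printed edges grow further apart as frequency increases, which is the perceptually motivated spacing that distinguishes MFCC from the linear-prediction-based LPC and LPCC parameters.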
Keywords/Search Tags: endpoint detection, speech enhancement, wavelet transform, threshold de-noising, feature extraction