
Research On Anti-noise Speech Recognition Methods Based On Robustness PLPC

Posted on: 2012-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: N N Tang
Full Text: PDF
GTID: 2218330338953812
Subject: Computer software and theory
Abstract/Summary:
Recognition of clean speech is now a mature technology: systems such as IBM's ViaVoice can reach recognition rates of 95% or higher. These systems, however, place strict requirements on the background environment of the voice input, and the recognition rate declines significantly when those requirements are not met. The main cause is the difference between the training and test environments, which produces a mismatch between the model and the test data. Most current recognition systems are trained on data recorded in a quiet laboratory, so their performance drops dramatically in noisy settings such as speech-based access control for an intelligent community. For speech recognition to move into practical use rather than remain confined to particular conditions, recognition systems must be robust to noise of different levels and types.

Taking access management in an intelligent community as its starting point, this thesis divides the anti-noise speech recognition system into three stages: preprocessing, feature parameter extraction, and speech recognition. It first analyzes and implements a new denoising model that combines a Wiener filter with smoothing. The model is used to remove additive noise such as man-made noise, natural noise, and equipment noise. In the preprocessing stage, the speech signal is windowed and its endpoints are detected before framing, and the signal is then denoised with this model. The Wiener filter first suppresses the additive noise; the smoothing step then removes the residual nonlinear distortion and glitches caused by the additive noise and channel distortion.

Perceptual linear prediction cepstrum coefficients (PLPC) have been a commonly used speech feature in recent years. Because they take the human hearing mechanism into account, PLPC resist noise and spectral distortion well, and they can achieve a higher recognition rate than other parameters when used in recognition systems under noisy conditions. In the feature extraction stage, this thesis proposes, for the first time, robust perceptual linear prediction cepstrum coefficients (RPLPC), which build on the traditional PLPC parameter by combining the discrete wavelet transform with the short-time Fourier transform. Speech is a non-stationary, time-varying signal, so the short-time Fourier transform is the main tool for analyzing it. However, feature parameters extracted with the short-time Fourier transform alone are imperfect. The wavelet transform has good localization in both the time and frequency domains, and it analyzes high-frequency components with progressively finer steps in both domains. It can therefore focus on any detail of the signal and extract the non-stationary information the signal carries. The wavelet transform is also a sub-band analysis, so it can yield feature parameters with strong noise resistance. Feature parameters obtained by combining wavelet analysis with the Fourier transform are therefore robust.

The experimental results show that, with the new denoising model and RPLPC, the speech recognition system achieves a higher recognition rate in environments with additive noise, which improves the robustness of the system.
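The denoising stage described in the abstract (a Wiener filter followed by smoothing) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulation, so everything here is an assumption: a frequency-domain Wiener gain computed per frame, a noise spectrum estimated from the first few frames (assumed to contain no speech), a 3-point median filter as the smoothing step, and the hypothetical names `wiener_denoise`, `median_smooth`, and `denoise` with their default parameters.

```python
import numpy as np


def wiener_denoise(x, frame_len=256, hop=128, noise_frames=6, gain_floor=0.05):
    """Frame-wise spectral Wiener filter (illustrative, not the thesis's model).

    Assumes the first `noise_frames` frames of `x` contain noise only.
    """
    x = np.asarray(x, dtype=float)
    win = np.sqrt(np.hanning(frame_len))  # used for both analysis and synthesis
    pad = frame_len                       # zero padding so edges overlap fully
    n_frames = int(np.ceil((len(x) + 2 * pad - frame_len) / hop)) + 1
    buf = np.zeros((n_frames - 1) * hop + frame_len)
    buf[pad:pad + len(x)] = x

    # Cut the padded signal into overlapping frames and transform each one.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    spec = np.fft.rfft(buf[idx] * win, axis=1)
    power = np.abs(spec) ** 2

    # Noise power spectrum from the first frames that start inside x.
    first = pad // hop
    noise_psd = power[first:first + noise_frames].mean(axis=0)

    # Wiener-style gain 1 - N/|Y|^2, floored to limit over-suppression.
    gain = np.maximum(1.0 - noise_psd / np.maximum(power, 1e-12), gain_floor)
    frames_out = np.fft.irfft(gain * spec, n=frame_len, axis=1) * win

    # Weighted overlap-add, normalized by the accumulated window energy.
    out = np.zeros_like(buf)
    norm = np.zeros_like(buf)
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += frames_out[i]
        norm[start:start + frame_len] += win ** 2
    out /= np.maximum(norm, 1e-3)
    return out[pad:pad + len(x)]


def median_smooth(x, width=3):
    """Sliding median of odd `width`; removes isolated glitches."""
    x = np.asarray(x, dtype=float)
    half = width // 2
    padded = np.pad(x, half, mode="edge")
    idx = np.arange(width)[None, :] + np.arange(len(x))[:, None]
    return np.median(padded[idx], axis=1)


def denoise(x):
    """Wiener filtering followed by smoothing, in the order the abstract gives."""
    return median_smooth(wiener_denoise(x))
```

Calling `denoise(noisy)` on a one-dimensional array returns an array of the same length. The median filter is only one possible choice for the smoothing step; a moving average or another smoother could be substituted without changing the overall structure.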
Keywords/Search Tags: Speech Recognition, Wiener Filter, Smoothing, Robust Perceptual Linear Prediction Cepstrum Coefficients, Wavelet Transform