
Noise Robust Technologies in Speech Recognition

Posted on: 2004-09-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Ding
Full Text: PDF
GTID: 1118360122467303
Subject: Communication and Information System
Abstract/Summary:
Prevailing speech recognition systems achieve very high accuracy on clean speech, but their performance degrades rapidly in noisy environments owing to the mismatch between the acoustic models and the test speech. Noise robustness is therefore a crucial problem for real-world applications of speech recognition.

Speech recognition is very sensitive to additive background noise. One of our contributions is a noise-robust algorithm based on compensating for the distortion introduced by speech enhancement; that is, we combine several robust methods to improve the robustness of speech recognition in noisy environments. In the signal space, speech enhancement is adopted to suppress the noise and increase the discriminative information embedded in the noisy speech signal. However, the speech distortion introduced by enhancement, as well as the residual noise, is very harmful to recognition. Our analysis concludes that the speech distortion and the residual noise can be approximately regarded as multiplicative noise and additive noise, respectively. Thus, we either use a parallel model combination (PMC) algorithm, deployed in the model space, to adapt the parameters of the speech models and compensate for the residual noise, or use a cepstral mean normalization (CMN) algorithm in the feature space to compensate for both the speech distortion and the residual noise. PMC and CMN are most efficient at moderate signal-to-noise ratios (SNRs), so, from another viewpoint, the noise reduction performed by speech enhancement helps PMC and CMN make the system more robust. The effective combination of robust methods in different spaces can significantly improve the accuracy of speech recognition, especially at low SNRs.

With the rapid growth of communication over wireless and computer networks, robust speech recognition in impulsive noise environments has become a hot research topic. After analyzing Viterbi decoding, we conclude that the adverse effect of impulsive noise on recognition is that it introduces an unreliable probability difference. Therefore, we propose a novel method that directly removes the effect of impulsive noise by flooring the observation probability of the noise-sensitive feature sub-vector at the Gaussian mixture level. Feature partition and threshold assignment are the two crucial problems of the proposed algorithm. A method is proposed to measure the noise sensitivity of each feature dimension, which leads to proper feature partition schemes. Another method is proposed to approximately calculate the flooring threshold of the noise-sensitive feature sub-vector, in which the integral of a multidimensional Gaussian distribution is converted into the sum of a series. Moreover, the flooring threshold given by this method is very close to the exact optimum value. A proper feature partition and an optimal threshold are effective both in removing the probability difference and in preserving more recognition information. The proposed method can significantly improve the performance of a recognition system in impulsive noise environments while maintaining high accuracy on clean speech. Weak dependence on noise characteristics and very low computational overhead are two important practical advantages.
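The CMN step mentioned in the additive-noise part of the abstract can be illustrated with a minimal sketch. It assumes the common per-utterance form of CMN, in which the mean of each cepstral coefficient over the utterance is subtracted from every frame; the abstract does not say which CMN variant the thesis uses, and the function name `cepstral_mean_normalize` is hypothetical. A distortion that is a fixed multiplicative factor in the spectral domain appears as a constant offset in the cepstral domain, which is why mean subtraction can remove it.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient utterance mean from every cepstral frame.

    frames: list of equal-length lists, one list of cepstral coefficients
    per frame. This is generic per-utterance CMN, not necessarily the exact
    variant used in the thesis.
    """
    num_frames = len(frames)
    num_coeffs = len(frames[0])
    # Mean of each cepstral coefficient across the whole utterance.
    means = [sum(frame[d] for frame in frames) / num_frames
             for d in range(num_coeffs)]
    return [[frame[d] - means[d] for d in range(num_coeffs)]
            for frame in frames]


# A constant cepstral offset (the assumed effect of a fixed multiplicative
# distortion) leaves the normalized features unchanged.
clean = [[1.0, 2.0], [3.0, 6.0]]
offset = [0.5, -1.0]
distorted = [[c + o for c, o in zip(frame, offset)] for frame in clean]
normalized_clean = cepstral_mean_normalize(clean)
normalized_distorted = cepstral_mean_normalize(distorted)
```

In this toy example both normalized sequences coincide, which is the property the abstract relies on when it treats enhancement distortion as approximately multiplicative.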
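The flooring idea described above can be sketched as follows. This is only an illustration under stated assumptions: diagonal-covariance Gaussian mixtures, a feature partition supplied by hand through `sensitive_dims`, and a log-domain threshold `log_floor` chosen arbitrarily. The thesis derives the partition from a noise-sensitivity measure and the threshold from a series approximation, neither of which is reproduced here, and all function and parameter names are hypothetical.

```python
import math


def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def floored_gmm_loglik(x, weights, means, variances, sensitive_dims, log_floor):
    """GMM log-likelihood with the sensitive sub-vector floored per mixture.

    For each mixture component, the log-likelihood of the noise-sensitive
    sub-vector is not allowed to fall below log_floor (an assumed threshold).
    """
    sensitive = sorted(set(sensitive_dims))
    robust = [d for d in range(len(x)) if d not in sensitive]

    def pick(vec, idx):
        return [vec[d] for d in idx]

    component_logs = []
    for w, mean, var in zip(weights, means, variances):
        robust_ll = diag_gauss_logpdf(
            pick(x, robust), pick(mean, robust), pick(var, robust))
        sensitive_ll = diag_gauss_logpdf(
            pick(x, sensitive), pick(mean, sensitive), pick(var, sensitive))
        # Flooring step: cap how strongly a corrupted sub-vector can
        # penalize this mixture component.
        component_logs.append(
            math.log(w) + robust_ll + max(sensitive_ll, log_floor))

    # Log-sum-exp over the mixture components.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


# Two single-component models that differ only in the sensitive dimension.
model_a = dict(weights=[1.0], means=[[0.0, 0.0]], variances=[[1.0, 1.0]])
model_b = dict(weights=[1.0], means=[[0.0, 5.0]], variances=[[1.0, 1.0]])

clean_frame = [0.0, 0.0]
impulsive_frame = [0.0, 50.0]  # second dimension hit by an impulse

ll_clean = floored_gmm_loglik(clean_frame, sensitive_dims=[1],
                              log_floor=-10.0, **model_a)
ll_a = floored_gmm_loglik(impulsive_frame, sensitive_dims=[1],
                          log_floor=-10.0, **model_a)
ll_b = floored_gmm_loglik(impulsive_frame, sensitive_dims=[1],
                          log_floor=-10.0, **model_b)
```

On the clean frame the floor is never reached, so the score is the ordinary Gaussian log-likelihood. On the impulsive frame both models hit the floor on the corrupted dimension, so the large and unreliable score difference between them disappears and only the robust dimension contributes to the comparison.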
Keywords/Search Tags: speech recognition, noise, robustness, combination, flooring observation probability