
Research On Robust Speech Recognition Method Of Agricultural Market Information Acquisition

Posted on: 2016-03-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J P Xu
Full Text: PDF
GTID: 1108330461989445
Subject: Crop Science
Abstract/Summary:
Speech recognition has made substantial progress: it performs satisfactorily in quiet environments and is increasingly used in human-computer interaction. In real environments, however, noise causes recognition performance to drop sharply, and improving the noise robustness of speech recognition has become a research focus in recent years. This thesis studies noise-robust speech recognition in the working environment of agricultural market information collection. No speech recognition engine has been built specifically for this field, and many recognition algorithms designed for general domains may not be suitable for it. Based on the characteristics of the noise environment, we study noise robustness methods for speaker-independent, medium-vocabulary, continuous Mandarin speech recognition and train HMM models accordingly. The main contents of this study are as follows:

(1) We build HMM-based acoustic models and use a self-built agricultural market information corpus to train and test them. Using the HTK toolkit, we develop a baseline speech recognition system for agricultural prices.

(2) After analyzing the noise characteristics of the agricultural market information collection working environment, we apply several robustness methods to the system in the model space and the feature space. For the choice of acoustic modeling unit, we propose an extended initial/final (X-IF) triphone model, which effectively handles coarticulation both within and between syllables and greatly improves the recognition rate. To cope with the resulting sharp increase in the number of triphone models, we use decision-tree state clustering and build a set of binary questions that brings expert knowledge and phonetic rules into the decision tree. This reduces the number of triphones and effectively solves the problem of insufficient training data. Because cepstral mean normalization (CMN) performs well at removing convolutional channel noise and additive noise terms, we apply CMN to alleviate the effects of channel noise.

(3) In the signal space, we use a spectral subtraction speech enhancement algorithm to improve the SNR of the input signal. Spectral subtraction, however, tends to introduce channel distortion and "musical" noise. To reduce this distortion, we propose a robust method that combines spectral subtraction with feature compensation, in which cepstral mean and variance normalization (CMVN) and spectral subtraction complement each other. Experimental results show that the combined algorithm effectively improves the recognition rate of the system, especially at low SNR.

(4) Within the framework of statistical estimation theory, we study the minimum mean square error (MMSE) and logarithmic minimum mean square error (log MMSE) amplitude estimators. On this basis, we propose a robustness method that combines an MMSE (or log MMSE) magnitude estimator with CMVN distortion compensation. Experiments show that this method is effective across the different agricultural market information collection environments, with varying degrees of noise robustness. Combining algorithms across multiple spaces improves the robustness of the ASR system, especially at low SNR.
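The combination described in item (3), spectral subtraction in the signal space followed by CMVN in the feature space, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the frame sizes, the over-subtraction factor `alpha`, the spectral floor `beta`, the use of the first few frames as a noise estimate, and the plain (non-mel) cepstrum are all choices made here for brevity, and every function name is hypothetical.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping Hamming-windowed frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def spectral_subtraction(mag, noise_frames=10, alpha=2.0, beta=0.01):
    """Power spectral subtraction with over-subtraction and a spectral floor.

    Assumption: the first `noise_frames` frames contain noise only.
    """
    power = mag ** 2
    noise_power = power[:noise_frames].mean(axis=0)
    cleaned = power - alpha * noise_power
    # The floor keeps the result non-negative and limits "musical" noise.
    return np.sqrt(np.maximum(cleaned, beta * noise_power))

def cepstrum(log_spec, n_ceps=13):
    """DCT-II of the log spectrum (simplification: no mel filterbank)."""
    n = log_spec.shape[1]
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return log_spec @ basis.T

def cmvn(features, eps=1e-8):
    """Normalize each coefficient to zero mean and unit variance per utterance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)

def enhance_and_normalize(x, n_ceps=13):
    """Spectral subtraction on the magnitude spectrum, then CMVN on cepstra."""
    mag = np.abs(np.fft.rfft(frame_signal(x), axis=1))
    enhanced = spectral_subtraction(mag)
    ceps = cepstrum(np.log(enhanced + 1e-10), n_ceps)
    return cmvn(ceps)
```

In this sketch the enhancement step raises the SNR before feature extraction, and the normalization step then removes the per-utterance offset and scale that the subtraction leaves in the cepstral features, which is one way to read the "complementary" relationship the abstract describes.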
Keywords/Search Tags:Speech recognition, Agricultural market information, Information collection, Speech enhancement, Noise robustness