
Research On Speech Recognition In Noisy Environment

Posted on: 2008-09-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Long
Full Text: PDF
GTID: 1118360242464760
Subject: Precision instruments and machinery
Abstract/Summary:
Aimed at practical speech control and with an emphasis on system robustness, this thesis examines in depth each main aspect of embedded isolated-word speech recognition in noisy environments. Through systematic research and experiments on robust speech recognition, a complete research framework is formed, covering the key parts: the experiment platform, a robust endpoint detection algorithm, a robust feature extraction algorithm, a feature compensation algorithm, and the acoustic model. Several significant results are obtained, and all of them are fully implemented and verified on the speech corpus. Finally, a complete isolated-word speech recognition system is constructed, comprising the speech corpus, software, hardware implementation, and application system, from which a practical speech control system can be developed directly. The results are described in the following aspects:

(1) Speech recognition experiment system
A speech recognition experiment system based on Hidden Markov Models (HMMs) is constructed, with the HMM implementation optimized for isolated-word recognition. A word selection scheme for robust speech recognition experiments is given to ensure that the experiments are representative. A complete speech corpus and noise corpus, together with a noise measurement standard, are built for isolated-word speech recognition to ensure that the experiments are repeatable.

(2) Endpoint detection algorithm
To address the shortcomings of the traditional double-threshold endpoint detection algorithm in noisy environments, several improvements are given. Permutation Entropy (PE), a nonlinear dynamics parameter, is applied to robust speech endpoint detection for the first time, and a double-threshold endpoint detection algorithm based on the energy-frequency ratio and the permutation entropy difference is proposed.
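As an illustration of the permutation entropy measure named above, the following is a minimal sketch of the standard normalized definition due to Bandt and Pompe, computed over one frame of samples. It is not the thesis's algorithm: the function name `permutation_entropy` and the parameters `order` and `delay` are assumptions made here, and the abstract does not specify how the permutation entropy difference or the thresholds are derived from this quantity.

```python
import math
from collections import Counter


def permutation_entropy(frame, order=3, delay=1):
    """Normalized permutation entropy of one frame, in the range [0, 1].

    `order` (embedding dimension) and `delay` are illustrative defaults,
    not values taken from the thesis.
    """
    n_windows = len(frame) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("frame too short for the chosen order and delay")

    counts = Counter()
    for i in range(n_windows):
        window = [frame[i + j * delay] for j in range(order)]
        # Ordinal pattern: the index order that sorts the window
        # (ties are broken by position because sorted() is stable).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1

    # Shannon entropy of the ordinal-pattern distribution.
    entropy = -sum(
        (c / n_windows) * math.log(c / n_windows) for c in counts.values()
    )
    # Normalize by the maximum entropy log(order!).
    return entropy / math.log(math.factorial(order))
```

A frame with a single repeated ordinal pattern, such as a steady ramp, scores 0, while irregular noise-like frames score close to 1. That contrast is what makes the measure a candidate feature for separating structured speech from background noise.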
Experiments on the speech platform compare the proposed endpoint detection algorithm with the traditional one. The results indicate that it is more robust than the traditional algorithm, with almost the same detection delay.

(3) Feature extraction algorithm
Several common feature parameters for speech recognition are systematically summarized. The principles, implementation details, advantages, and disadvantages of features based on Linear Predictive Coding (LPC) and Mel-Frequency Cepstral Coefficients (MFCC) are analyzed in detail. To address the problems of the LPC and MFCC features, a spectrum estimation method based on the Minimum Variance Distortionless Response (MVDR) is introduced into speech feature extraction. This method combines, to some degree, the advantages of LPC and MFCC. Several computational improvements are given that exploit the characteristics of the speech signal. The performance of the MVDR method is compared experimentally with that of other methods.

(4) Robust speech recognition technology
System robustness problems, including environmental noise resistance, speaker adaptation, and channel adaptation, are studied thoroughly. A robust feature extraction algorithm for speech recognition is proposed, based on the MVDR spectrum estimation method. It estimates the MVDR spectrum on the Mel frequency scale, filters the modulation spectrum of the MVDR spectrum, and then extracts cepstral coefficients as the feature parameters. Experiments compare the proposed algorithm with the MVDR and MFCC feature extraction algorithms under different levels of car noise, babble noise, and Gaussian white noise.
The results indicate that the recognition accuracy of the system improves to some degree under all three noise conditions.

(5) Hardware implementation
Four schemes for hardware implementation of the isolated-word speech recognition algorithm are compared: a general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), and a Field-Programmable Gate Array (FPGA). An FPGA-based hardware implementation is designed. The complete design flow, design scheme, and test scheme for FPGA-based isolated-word speech recognition are given, together with detailed specifications of every module and the design of the peripheral circuit. A complete speech recognition system can thus be implemented fully in hardware.
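For reference, the MVDR spectrum estimate named in parts (3) and (4) is usually written in the form below. This is the standard textbook expression, not the thesis's optimized computation. The notation is assumed here: M is the filter order, R_{M+1} is the (M+1) x (M+1) Toeplitz autocorrelation matrix of the speech frame, and v(ω) is the frequency steering vector.

```latex
P_{\mathrm{MV}}(\omega)
  = \frac{1}{\mathbf{v}^{H}(\omega)\,\mathbf{R}_{M+1}^{-1}\,\mathbf{v}(\omega)},
\qquad
\mathbf{v}(\omega) = \left[\,1,\; e^{j\omega},\; \dots,\; e^{jM\omega}\,\right]^{T}
```

The value at each frequency is the output power of an FIR filter that is constrained to pass that frequency without distortion while minimizing its total output power.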
Keywords/Search Tags:Robust speech recognition, Feature extraction, Hidden Markov model, Permutation entropy, Minimum variance distortionless response, Modulation spectrum, Feature compensation, Field programmable gate array