Study For Speech Recognition System At Low SNR

Posted on: 2015-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: D X Zhao
GTID: 2298330467452621
Subject: Electronics and Communications Engineering
Abstract/Summary:
Robotics is an important research area in modern science and technology, and speech recognition for human-computer interaction is one of its key technologies. With the rapid development of information processing theory and computer technology, speech recognition has made great progress in both theory and practice. However, because physical environments are complex and variable, existing speech recognition technology must be improved to remain robust at low SNR before it becomes truly useful for human-computer dialogue.

A speech recognition system mainly comprises preprocessing, endpoint detection, feature extraction, training, and recognition. Building on a study of such a system, this thesis examines how the main existing algorithms perform at low SNR, with the aim of developing a new noise-robust speech recognition system. The specific innovations and major contributions of this thesis are as follows:

1. Traditional speech endpoint detection algorithms have difficulty detecting unvoiced speech at low SNR, and their robustness degrades, so we propose the snr-energy-zerocrossing-entropy speech endpoint detection algorithm. First, even at low SNR, a speech frame still has a high SNR at some frequency or in some sub-band, whereas a noise frame does not, so a maximum SNR can be obtained from speech frames. Second, the differences in entropy, energy, and zero-crossing rate between the speech signal and the noise signal are taken into account and combined with the maximum SNR to obtain the snr-energy-zerocrossing-entropy parameter. A dynamic update factor is also proposed to update the endpoint detection threshold. The simulation results show that the snr-energy-zerocrossing-entropy parameter differs sharply between speech frames and noise frames.

2. Traditional speech feature extraction algorithms have high complexity and limited noise robustness, so this thesis proposes a multi-scale MFCC feature extraction algorithm. In a noisy environment, people can still hear the content they are attending to, which is related to the auditory and perceptual characteristics of human hearing. Based on the MFCC feature extraction algorithm and the perceptual characteristics of the human ear in the low-, mid-, and high-frequency (LF, MF, HF) ranges, the resolution and amplitude of the filters are set accordingly. The simulation results at low SNR show that the multi-scale MFCC parameters give higher recognition accuracy without increasing the dimension of the feature vector sequence.

3. Recognizing large-scale, nonlinear, high-dimensional data requires more memory and longer processing time, among other issues, so this thesis presents the NPASVM classification algorithm. Because the SVM optimization problem is equivalent to the nearest-point problem between two convex polyhedra, the NPA algorithm can be used to convert the nonlinear data into linear data. The simulation results show that, with the NPA algorithm, a relatively small amount of data is stored and the processing time is shorter.

4. A Matlab-based platform was set up, on which intensive experiments were carried out. The results confirm the validity and effectiveness of the proposed algorithms.
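
The endpoint detection idea in contribution 1 can be illustrated with a minimal sketch. The abstract does not give the exact formula that combines the four quantities, so everything specific here is an assumption made for illustration: the product form in `combined_score`, the `margin` and `alpha` constants, the noise estimate taken from the first few frames, and all function names. The sketch is written in plain Python rather than Matlab and is not the thesis's implementation.

```python
import cmath
import math
import random

EPS = 1e-12  # guards against division by zero and log(0)

def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (adequate for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                  for t, v in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec

def short_time_energy(frame):
    return sum(v * v for v in frame)

def zero_crossing_rate(frame):
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def spectral_entropy(spec):
    """Entropy of the normalized power spectrum (flat spectrum = high entropy)."""
    total = sum(spec) + EPS
    return -sum((p / total) * math.log(p / total + EPS) for p in spec)

def max_subband_snr(spec, noise_spec, n_bands=8):
    """Largest ratio of frame power to estimated noise power over equal-width bands."""
    width = len(spec) // n_bands
    ratios = []
    for b in range(n_bands):
        lo, hi = b * width, (b + 1) * width
        ratios.append(sum(spec[lo:hi]) / (sum(noise_spec[lo:hi]) + EPS))
    return max(ratios)

def combined_score(frame, noise_spec, noise_energy, noise_zcr):
    """ASSUMED combination: the thesis's actual formula is not in the abstract."""
    spec = power_spectrum(frame)
    snr = max_subband_snr(spec, noise_spec)
    energy_ratio = short_time_energy(frame) / (noise_energy + EPS)
    zcr_gap = abs(zero_crossing_rate(frame) - noise_zcr)
    return snr * energy_ratio * (1.0 + zcr_gap) / (spectral_entropy(spec) + EPS)

def detect(frames, n_init=5, margin=3.0, alpha=0.9):
    """Flag speech frames; the first n_init frames are assumed to be noise only."""
    init = frames[:n_init]
    specs = [power_spectrum(f) for f in init]
    noise_spec = [sum(col) / n_init for col in zip(*specs)]
    noise_energy = sum(short_time_energy(f) for f in init) / n_init
    noise_zcr = sum(zero_crossing_rate(f) for f in init) / n_init
    scores = [combined_score(f, noise_spec, noise_energy, noise_zcr)
              for f in frames]
    floor = sum(scores[:n_init]) / n_init
    flags = []
    for s in scores:
        is_speech = s > margin * floor
        if not is_speech:
            # Hypothetical dynamic update: track the noise floor on noise frames.
            floor = alpha * floor + (1.0 - alpha) * s
        flags.append(is_speech)
    return scores, flags

def demo(seed=0, frame_len=64, n_noise=20, n_speech=20):
    """Synthetic test signal: white noise, then a tone in noise at about 0 dB SNR."""
    rng = random.Random(seed)
    frames = []
    for i in range(n_noise + n_speech):
        frame = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
        if i >= n_noise:
            frame = [v + math.sqrt(2.0) * math.sin(2 * math.pi * 8 * t / frame_len)
                     for t, v in enumerate(frame)]
        frames.append(frame)
    return detect(frames)

if __name__ == "__main__":
    scores, flags = demo()
    print(sum(scores[:20]) / 20, sum(scores[20:]) / 20)
```

The sub-band maximum is what lets a narrowband component stand out even when the full-band SNR is near 0 dB, which is the observation the abstract starts from. A real detector would replace the synthetic tone with framed, windowed speech and would tune the assumed constants.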
Keywords/Search Tags: endpoint detection, snr-energy-zerocrossing-entropy, feature extraction, multi-scale mfcc, npasvm