
Principles and Implementation of a Voice Command Recognition System

Posted on: 2002-12-23
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Fu
Full Text: PDF
GTID: 2208360032453927
Subject: Computer application technology
Abstract/Summary:
After nearly half a century of development, speech recognition technology is maturing and its scope keeps broadening. However, despite the enormous research effort spent trying to create an intelligent machine that can recognize the spoken word and comprehend its meaning, we are still far from the goal of a machine that can understand spoken discourse on any subject, by all speakers, in all environments. Setting aside the difficulties of continuous speech recognition, speech instruction (voice command) recognition systems based on isolated-word or isolated-phrase recognition are moving into practical use and already have some applications. This thesis gives a detailed discussion of the fundamental principles and approaches of such systems, as well as some technical problems that arise when applying them. First, we give a brief introduction to the whole system and the main function of each part, and compare several proposed approaches to automatic speech recognition by machine. The first main topic of the thesis is spectral analysis. Speech recognition systems today use six major classes of spectral analysis algorithms, which are derived from filter bank, Fourier transform, and linear prediction methods. Alongside these, we also consider related techniques such as spectral shaping, parametric smoothing, vector quantization, short-time frame analysis, and parametric normalization. Spectral analysis compresses the redundant information in the speech signal into a parametric vector for the subsequent pattern recognition stage. In discussing spectral analysis, we emphasize the filter bank method. For designing a digital filter, the SPTOOL tool in MATLAB is very convenient. We then present a critical-band filter bank designed on both the mel and Bark scales. We compare FIR and IIR filters and discuss how to choose the type and number of filters in the bank in practice.
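As an illustration of the critical-band filter bank idea mentioned above, the following minimal Python sketch places filter band edges at equal spacing on the mel or Bark scale. It is not the design used in the thesis (which relies on MATLAB's SPTOOL). The function names, the 100 to 4000 Hz range, and the choice of 16 filters are assumptions made for this example. The mel mapping uses the common formula 2595 log10(1 + f/700), and the Bark mapping uses Traunmueller's approximation, chosen here because it has a closed-form inverse.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale approximation.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def hz_to_bark(f_hz):
    # Traunmueller's approximation of the Bark scale (an assumption here).
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def bark_to_hz(z):
    # Closed-form inverse of hz_to_bark.
    return 1960.0 * (z + 0.53) / (26.28 - z)

def band_edges(f_low, f_high, n_filters, to_scale, from_scale):
    """Return n_filters + 2 edge frequencies in Hz, equally spaced on the
    perceptual scale. Filter k spans edges[k] .. edges[k + 2] and is
    centered at edges[k + 1]."""
    lo, hi = to_scale(f_low), to_scale(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [from_scale(lo + i * step) for i in range(n_filters + 2)]

# Hypothetical settings: 16 filters covering 100 Hz to 4000 Hz.
mel_edges = band_edges(100.0, 4000.0, 16, hz_to_mel, mel_to_hz)
bark_edges = band_edges(100.0, 4000.0, 16, hz_to_bark, bark_to_hz)
```

On either scale the bands are narrow at low frequencies and widen toward high frequencies, which is the property a critical-band filter bank is meant to capture. Each band could then be realized as an FIR or IIR band-pass filter.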
In addition, we discuss the computation of the feature vector, different window functions, and frame-based overlapping analysis. At the end of this topic, software buffering and hardware A/D conversion are discussed. Pattern comparison is the other main topic of this thesis. In this part, we mainly discuss speech endpoint detection, i.e. how to separate the acoustic events of interest in a continuously recorded signal from the other parts of the signal (e.g. background); spectral-distortion measures, which measure the difference between two spectral vectors; and time alignment and normalization, which measure the difference between two speech patterns, each consisting of a sequence of spectral vectors. The Dynamic Time-Warping (DTW) solution and other considerations in DTW are discussed in detail. Finally, we give a summary and an outlook on future trends in research and applications. This research is sponsored by the Fund of Aviation Research and the Project of Scientific Research of Shaanxi.
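The time-alignment step described above can be sketched with a basic DTW recursion. This is a minimal illustration, not the thesis's implementation. It assumes a Euclidean distance as the spectral-distortion measure, the simplest symmetric local path constraint (steps from the left, lower, and diagonal neighbors), no slope weighting or global path constraint, and hypothetical one-dimensional "feature vectors" in the usage lines.

```python
import math

def euclidean(a, b):
    # Stand-in spectral-distortion measure between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(ref, test, dist=euclidean):
    """Accumulated distance along the best time-warping path between two
    sequences of feature vectors (both must be non-empty)."""
    n, m = len(ref), len(test)
    inf = float("inf")
    # acc[i][j]: best accumulated cost aligning ref[:i] with test[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(ref[i - 1], test[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j],      # ref frame advances
                                   acc[i][j - 1],      # test frame advances
                                   acc[i - 1][j - 1])  # both advance
    return acc[n][m]

def recognize(templates, utterance):
    # Pick the command whose template has the smallest DTW distance.
    return min(templates, key=lambda name: dtw_distance(templates[name], utterance))

# Hypothetical templates and a slower rendition of the first one.
templates = {
    "up": [[0.0], [1.0], [2.0]],
    "down": [[2.0], [1.0], [0.0]],
}
utterance = [[0.0], [0.0], [1.0], [2.0], [2.0]]
best = recognize(templates, utterance)
```

Because the warping path may repeat frames, the stretched utterance matches the "up" template with zero accumulated distance even though the two sequences differ in length. In a real system the inputs would be the spectral vectors of the endpoint-detected speech segment.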
Keywords/Search Tags: speech instructions recognition, filter bank, linear prediction, spectral shaping, Dynamic Time-Warping (DTW)