
The Research Of Speech Recognition Technology Based On FPGA

Posted on: 2008-12-01  Degree: Master  Type: Thesis
Country: China  Candidate: Q Y Xie
GTID: 2178360242988980  Subject: Computer application technology
Abstract/Summary:
Many speech recognition systems are implemented in software, but more and more applications now require physical compactness and portability in addition to low power consumption. Dedicated speech recognition chips based on integrated circuits therefore have considerable room for development. Current DSP-based speech chips are expensive and inflexible in design, which limits further performance improvement. An FPGA (Field-Programmable Gate Array) offers many advantages, such as low power consumption, small size, high integration and speed, a short development cycle, low cost, user-definable functions, and repeated programming and erasing, so it performs well in parallel computation. This paper studies how to implement speech recognition algorithms on an FPGA. The main work is as follows:

A variety of FPGA design methods for digital processing algorithms are studied and implemented, including the VLSI architecture design method, the Matlab-modeling-based DSP hardware design method, and the IP core design method. Some basic hardware computing units are implemented with these methods and used in the speech recognition algorithms.

The front-end processing of speech recognition, including pre-emphasis, framing, windowing, and endpoint detection. An endpoint detection method based on energy change is proposed and improved with real-time framing, so that it performs well in real-time endpoint detection and has some noise robustness.

The feature extraction of speech recognition and its hardware design. The Mel Frequency Cepstrum Coefficient (MFCC) simulates the characteristics of human hearing, so it offers high recognition performance and noise robustness. However, its computation is complex, involving the Fast Fourier Transform (FFT), triangular filtering, the logarithm, and the Discrete Cosine Transform (DCT). In this paper, the hardware design of each stage improves its speed. In the FFT, reducing the number of FFT points for real-valued input improves the speed by 40%.
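The abstract does not say how the number of FFT points is reduced. One widely used technique for real-valued input, shown here only as an illustrative sketch and not as the thesis's own design, packs the even and odd samples into one complex sequence so that an N-point real spectrum is obtained from a single N/2-point complex FFT. The function names `fft` and `real_fft_half_size` are hypothetical, and the code is a software model rather than a hardware description.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def real_fft_half_size(x):
    """Bins 0..N/2 of the N-point FFT of real x, using one N/2-point FFT."""
    n = len(x)
    half = n // 2
    # Pack even samples into the real part and odd samples into the imaginary part.
    z = [complex(x[2 * i], x[2 * i + 1]) for i in range(half)]
    Z = fft(z)
    spectrum = []
    for k in range(half + 1):
        zk = Z[k % half]
        zc = Z[(half - k) % half].conjugate()
        even_k = (zk + zc) / 2    # spectrum of the even-indexed samples
        odd_k = (zk - zc) / 2j    # spectrum of the odd-indexed samples
        spectrum.append(even_k + cmath.exp(-2j * cmath.pi * k / n) * odd_k)
    return spectrum  # remaining bins follow from conjugate symmetry
```

Only bins 0 to N/2 are returned, which is all that a power spectrum feeding a Mel filter bank needs for real input.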
In the triangular filter computation, each center frequency is converted into the corresponding point in the frequency spectrum for high computational efficiency. In the logarithm, a look-up table and linear interpolation are used to improve precision. Finally, after analysis of the MFCC process, a three-pipeline processing hardware structure is presented. It can perform the triangular filtering, logarithm, and DCT almost in parallel, which accelerates MFCC extraction. In Vector Quantization (VQ), the efficiency of the codebook search is improved by comparing the result with the minimum.

The Viterbi recognition algorithm and its hardware implementation. The Hidden Markov Model (HMM) is used for modeling and matching, and it can be considered the most powerful technique in terms of computation and storage requirements. A method tailored to the HMM structure, which modifies the formula of the traditional Viterbi algorithm, achieves high search speed through pruning. Four ACS (add-compare-select) units are used for parallel processing, which simplifies the circuit and improves recognition speed.
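As a rough software illustration of the add-compare-select step, the sketch below scores one left-to-right HMM in the log domain. It assumes a topology in which each state can only stay or advance by one, which the abstract does not confirm, and the `beam` pruning parameter and all names are hypothetical. It is not the modified formula used in the thesis.

```python
import math

NEG_INF = float("-inf")


def viterbi_left_to_right(log_self, log_next, log_emit, beam=None):
    """Best-path log score through a left-to-right HMM.

    log_self[j]    : log probability of staying in state j
    log_next[j]    : log probability of moving from state j to j + 1
    log_emit[t][j] : log likelihood of frame t in state j
    beam           : optional pruning width (hypothetical parameter)
    """
    n_states = len(log_self)
    score = [NEG_INF] * n_states
    score[0] = log_emit[0][0]  # every path starts in state 0
    for t in range(1, len(log_emit)):
        new = [NEG_INF] * n_states
        for j in range(n_states):
            # Add: extend the two possible predecessors of state j.
            stay = score[j] + log_self[j]
            move = score[j - 1] + log_next[j - 1] if j > 0 else NEG_INF
            # Compare and select: keep the better predecessor.
            best = stay if stay >= move else move
            if best > NEG_INF:
                new[j] = best + log_emit[t][j]
        if beam is not None:
            # Prune states that fall too far below the best score of this frame.
            top = max(new)
            new = [s if s >= top - beam else NEG_INF for s in new]
        score = new
    return score[-1]  # every path ends in the last state
```

The iterations of the inner loop over `j` are independent of each other within a frame, which is the kind of work that several hardware ACS units could process in parallel. For recognition, each word model would be scored this way and the highest score selected.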
Keywords/Search Tags: speech recognition, FPGA, HMM, MFCC, Viterbi