A speech recognition IC with an efficient MFCC extraction algorithm and multi-mixture models

Posted on:2007-03-26

Degree:Ph.D.

Type:Thesis

University:The Chinese University of Hong Kong (Hong Kong)

Candidate:Han, Wei


GTID:2448390005467746

Subject:Engineering

Abstract/Summary:

Automatic speech recognition (ASR) by machine has received a great deal of attention over the past decades. Speech recognition algorithms based on Mel frequency cepstrum coefficients (MFCC) and the hidden Markov model (HMM) achieve better recognition performance than other speech recognition algorithms and are widely used in many applications. This thesis presents a speech recognition system with an efficient MFCC extraction algorithm and multi-mixture models. The system is composed of two parts: an MFCC feature extractor and an HMM-based speech decoder.

In the conventional MFCC feature extraction algorithm, speech is divided into short, overlapping frames. The existing extraction algorithm requires a large amount of computation and is not well suited to hardware implementation. We have developed a hardware-efficient MFCC feature extraction algorithm in our work. The new algorithm reduces the computational power required by 54% compared to the conventional algorithm, with only a 1.7% reduction in recognition accuracy.

For the HMM-based decoder of the speech recognition system, it is advantageous to use models with multiple mixtures, but the calculation becomes more complicated as the number of mixtures grows. Using a table look-up method proposed in this thesis, the new design can handle up to 16 states and 8 mixtures, and it can easily be extended to handle models with more states and mixtures. We have implemented the new algorithm on an Altera FPGA chip using fixed-point calculation and tested the FPGA chip with speech data from the AURORA 2 database, a well-known database designed to evaluate the performance of speech recognition algorithms in noisy conditions [27]. The recognition accuracy of the new system is 91.01%. For comparison, a conventional software recognition system running on a PC using 32-bit floating-point calculation has a recognition accuracy of 94.65%.
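The abstract does not say how the table look-up method works, so the following is only an illustrative sketch of one common way a table can keep multi-mixture scoring cheap in fixed-point arithmetic: a "log-add" table that approximates log(exp(a) + exp(b)) without evaluating exp or log at run time. It is not taken from the thesis. The fixed-point scale (`SCALE`), the table length (`TABLE_SIZE`), and all function names are assumptions chosen for illustration.

```python
import math

# Hypothetical fixed-point format: real value = integer / SCALE.
SCALE = 256
# Hypothetical table length: covers differences up to TABLE_SIZE / SCALE = 8.0.
TABLE_SIZE = 2048

# LOG_ADD_TABLE[d] approximates log(1 + exp(-d / SCALE)) in the same format.
# It is built once offline; in hardware it would sit in a small ROM.
LOG_ADD_TABLE = [
    round(math.log1p(math.exp(-d / SCALE)) * SCALE) for d in range(TABLE_SIZE)
]


def to_fixed(x):
    """Quantize a real-valued log score to the assumed fixed-point format."""
    return round(x * SCALE)


def to_float(q):
    """Convert a fixed-point log score back to a float for inspection."""
    return q / SCALE


def log_add(a, b):
    """Approximate log(exp(a) + exp(b)) using one compare, one subtract,
    one table read, and one add on fixed-point integers."""
    hi, lo = (a, b) if a >= b else (b, a)
    diff = hi - lo
    if diff >= TABLE_SIZE:
        # The smaller term is negligible at this precision.
        return hi
    return hi + LOG_ADD_TABLE[diff]


def mixture_log_likelihood(component_logs):
    """Fold the per-mixture terms log(weight * density) of one HMM state
    into a single state log-likelihood."""
    total = component_logs[0]
    for value in component_logs[1:]:
        total = log_add(total, value)
    return total


# Example with 8 mixture components (made-up log scores).
example_logs = [-3.2, -4.1, -2.7, -6.0, -5.5, -3.9, -8.3, -4.4]
approx = to_float(mixture_log_likelihood([to_fixed(v) for v in example_logs]))
exact = math.log(sum(math.exp(v) for v in example_logs))
print(f"table look-up: {approx:.4f}  floating point: {exact:.4f}")
```

In this sketch the cost per extra mixture is a single table read plus integer arithmetic, which is why adding mixtures does not require extra exponentials; the price is a small quantization error that depends on the chosen scale and table length.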

Keywords/Search Tags:

Recognition, Algorithm, Efficient MFCC, Models

Related items

1	The Comparison And Analysis Of The Feature Extraction Algorithm Of Voiceprint Recognition System
2	Research On Speech Emotion Recognition Algorithm
3	Study On MFCC And Lasso Reverberation Suppression Of Feature Extraction Algorithm Of Speech Recognition
4	Research And Practice Of Speaker Recognition Based On GMM
5	The Research Of Robust Speech Recognition In Noise Environment Based On MFCC
6	The Study Of Speaker Recognition System Based On MFCC
7	Study Of Speaker Recognition System Based On MFCC And GMM
8	Based On The Improved MFCC Parameters Of The Application Of Speech Recognition System
9	The Research Of Voice Processing And Recognition Algorithm Based On DSP
10	Research On Voiceprint Recognition Technology And Application Based On MT MFCC And Improved V Neural Network