
Research on Segmentation-Based Chinese Continuous Speech Recognition Technology

Posted on: 2011-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: B Q Zhang
GTID: 2178330332978673
Subject: Military Intelligence
Abstract/Summary:
Continuous speech recognition has made great progress as a key technology for man-machine conversation. A current research focus is integrating acoustic, phonetic, and linguistic knowledge into statistics-based continuous speech recognition systems. The deletion and insertion errors of such systems come mainly from imprecise endpoint detection. To address this problem, this thesis studies speech segmentation and search algorithms, and builds a Chinese continuous speech recognition system based on segmentation knowledge. The major contributions are as follows.

This thesis examines the characteristics of male formants in continuous speech. The formants of 8 vowels from 10 male speakers are compared statistically. The results show that F2, F3, and F3/F2 are valid parameters for discriminating pure vowels.

This thesis designs a baseline Chinese continuous speech recognition system based on the hidden Markov model, and compares the performance of different features, including formants, LPC, LPCC, MFCC, and PLP. The results show that cepstral features processed according to psychoacoustic theory perform better.

This thesis presents a method for segmenting Chinese initials and finals based on the detection of auditory events. The speech is first filtered with a cochlear filter bank. The auditory events corresponding to abrupt energy changes in each band are then detected and integrated separately within different frequency ranges to determine the candidate boundaries. Finally, voiceless-consonant initials, voiced-consonant initials, zero-initial syllables, and ordinary finals are separated in the order given by a binary tree. Experimental results show that, at a sampling frequency of 8 kHz, the accuracy is 88.9% for clean speech and above 82.9% for noisy speech at an SNR of 10 dB.

This thesis presents a new segmentation-based search strategy for Chinese continuous speech recognition, which finds the best word sequence in two stages: an acoustic decoder followed by a language decoder. A single-syllable syntax net and a double-syllable syntax net are first proposed for the acoustic decoder, giving two different forms of results. The A* algorithm and the token passing algorithm are then applied in the language decoder. Experimental results show that, with acoustic decoding on the double-syllable syntax net and language decoding by token passing, deletion and insertion errors decrease markedly and the recognition accuracy also improves.
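As a rough illustration of the candidate-boundary step in the segmentation method, the sketch below flags frames where the log-energy of a band changes abruptly and keeps the frames flagged by several bands. It is a simplified stand-in, not the method of the thesis: the cochlear filter bank is not implemented (the input is assumed to be per-band frame energies already computed), the cross-band vote replaces the integration over frequency ranges described in the abstract, and the names `band_events`, `candidate_boundaries`, `jump_db`, and `min_votes`, along with their default values, are hypothetical.

```python
import math


def band_events(energies, jump_db=6.0):
    """Return the frame indices where one band's log-energy changes abruptly.

    `energies` is a sequence of non-negative frame energies for a single band.
    `jump_db` is an assumed threshold, not a value taken from the thesis.
    """
    floor = 1e-10  # avoids log10(0) on silent frames
    log_e = [10.0 * math.log10(max(e, floor)) for e in energies]
    return {
        t for t in range(1, len(log_e))
        if abs(log_e[t] - log_e[t - 1]) >= jump_db
    }


def candidate_boundaries(band_energies, jump_db=6.0, min_votes=2):
    """Keep the frames that at least `min_votes` bands flag as events.

    The simple vote stands in for the thesis's integration of events
    within different frequency ranges (an assumption of this sketch).
    """
    votes = {}
    for energies in band_energies:
        for t in band_events(energies, jump_db):
            votes[t] = votes.get(t, 0) + 1
    return sorted(t for t, n in votes.items() if n >= min_votes)


# Toy input: three bands, eight frames each (made-up numbers).
bands = [
    [1, 1, 1, 100, 100, 100, 1, 1],
    [1, 1, 1, 50, 50, 50, 50, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
print(candidate_boundaries(bands))  # [3]
```

Only frame 3 is flagged by two bands in the toy input, so it is the single candidate boundary; lowering `min_votes` to 1 would also admit the events that appear in one band only.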
Keywords/Search Tags: speech recognition, hidden Markov model, formants, auditory events, initial and final segmentation, syntax net, word graph search algorithm