Research And Implementation Of Chinese Continuous Speech Recognition System

Posted on: 2011-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: L P Zhang
GTID: 2208360305959313
Subject: Computer software and theory
Abstract/Summary:
Speech recognition is a technology that extracts character symbols from a speech signal through computer processing. Chinese continuous speech recognition has been researched for almost 60 years at home and abroad. Although continuous speech recognition research has produced some achievements, many problems remain unresolved. Existing speech recognition technologies cannot yet reach the goal of letting humans communicate with machines in natural speech. Large-vocabulary, speaker-independent continuous speech recognition remains both the main difficulty and the central focus of the field.

This thesis studies the key technologies of Chinese continuous speech recognition. It first introduces the principles of speech recognition, the composition of a speech recognition system, and the basics of Chinese speech. It then describes the functions and key technologies of the pre-processing, feature extraction, pattern matching, and post-processing stages. Improved methods are proposed to address problems in the traditional methods. The main contributions are as follows:

1) A medium-vocabulary, speaker-independent Chinese continuous speech recognition system is implemented on a personal computer using Microsoft Visual C++, MATLAB, Microsoft SQL Server, and other tools, and experiments are carried out on it. The system uses initials and finals as recognition units, MFCCs as feature parameters, and DTW as the recognition model.

2) The accuracy of initial/final (I/F) segmentation has a great influence on the system. Current I/F segmentation methods achieve high accuracy on non-continuous speech, but their accuracy drops on continuous speech. This thesis proposes a new method based on entropy and the formant energy of Chinese vowels, which segments initials and finals accurately.

3) A speech recognition system based on traditional Dynamic Time Warping requires a large amount of computation and has a long response time. To address this problem, two improved DTW methods are introduced: one based on a template threshold and one based on a feature vector threshold. Experimental results show that the new methods reduce computation time and shorten response time.
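The abstract does not describe how the template-threshold and feature-vector-threshold methods work, so the following is only a generic sketch of the underlying idea: standard DTW over sequences of feature vectors, with an optional threshold that abandons a comparison early once it can no longer beat the best match found so far. The function names `dtw_distance` and `recognize`, the Euclidean local distance, and the toy one-dimensional "features" are all assumptions made for illustration and are not taken from the thesis.

```python
import math

def dtw_distance(a, b, threshold=math.inf):
    """Classic DTW cost between two sequences of feature vectors.

    Returns math.inf if every partial path in some row already costs
    more than `threshold` (early abandoning). This is an illustrative
    pruning rule, not the thesis's own threshold method.
    """
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])  # Euclidean local distance
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        # Local costs are non-negative, so the final cost can never be
        # lower than the cheapest cell in the current row.
        if min(curr[1:]) > threshold:
            return math.inf
        prev = curr
    return prev[m]

def recognize(features, templates):
    """Return (label, cost) of the closest template, pruning with the best cost so far."""
    best_label, best_cost = None, math.inf
    for label, template in templates.items():
        cost = dtw_distance(features, template, threshold=best_cost)
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label, best_cost

# Toy usage: 1-D "feature vectors" standing in for MFCC frames.
templates = {
    "ba": [(0.0,), (1.0,), (2.0,)],
    "ma": [(5.0,), (6.0,), (7.0,)],
}
utterance = [(0.0,), (1.0,), (1.0,), (2.0,)]
label, cost = recognize(utterance, templates)
```

In this sketch the pruning never changes which template wins; it only skips work on templates that are already worse than the current best, which is one common way a threshold can reduce DTW computation time.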
Keywords/Search Tags: Speech Recognition, Endpoint Detection, Mel Frequency Cepstral Coefficient (MFCC), Dynamic Time Warping (DTW)