
A Design Execution And Recognition Testing Of Multimedia English Database Of Second Language

Posted on: 2008-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Su
Full Text: PDF
GTID: 2178360215466874
Subject: Computer application technology
Abstract/Summary:
Automatic speech recognition (ASR) is a multidisciplinary research subject that is gradually becoming a key technology for man-machine interfaces in the information world. ASR has made great progress toward providing high accuracy for users. Because of the special status of English, many speech databases of English as a first language exist, and they have played a major role in the progress of ASR. With globalization, more and more people use English as a second language. To provide high recognition performance for these speakers, it is necessary to design a database of English as a second language.

Each country has its own language with its own pronunciation characteristics, and these influence the English pronunciation of people who speak English as a second language. In this thesis we investigate how to design and create a database of English as a second language in China. We then collected data and tested it with HTK.

The work done in this thesis is as follows:

1. We constructed an HTK-based ASR system on the Linux operating system.

2. We used Mel Frequency Cepstrum Coefficients (MFCC), which reflect phonetic characteristics, as the features of the ASR system, and added incremental parameters so that the dynamic features of the speech signal are taken into account. Experiments show that this mixture of static and incremental parameters greatly improves the recognition rate. We also studied improvements to the feature parameters, compared the recognition rates obtained with various parameters, and identified the parameters that gave the highest recognition rate.

3. We used Hidden Markov Models (HMM) to train the models. We tested different numbers of states and found that the system performed best with 10 states.

4. We described the process of designing and creating the speech database and of training the speech models, and then compared the test results on standard data (AVICAR data) with those on the collected data. We found that the recognition rate for English digits fell greatly for the Chinese speakers in our database, which demonstrates the necessity of building such a database. We then analyzed the reasons for the low recognition rate. Finally, we compared the test results for speakers from different areas of China and summarized the reasons for the differences in recognition rate. Our investigation provides experience for designing and creating speech databases of English as a second language.

5. We selected the Chinese speakers' speech data from TIDIGIT, added it once, twice, and three times to the AVICAR data to train models, and compared the resulting models. We found that the recognition rate improved, which demonstrates that using a suitable model raises the recognition rate considerably.
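The abstract's second item mentions adding dynamic (incremental) information to the MFCC features but does not give the formula. As a minimal sketch, assuming the regression-style delta coefficients commonly used in HTK-based front ends, the incremental parameters could be computed as below. The function names `delta` and `add_dynamic_features`, the window size of 2, and the edge handling (repeating the first and last frame) are illustrative assumptions, not details taken from the thesis.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Regression-based delta coefficients (assumed formula, not from the thesis).

    d_t = sum_{k=1..window} k * (c_{t+k} - c_{t-k}) / (2 * sum_{k=1..window} k^2)
    Frames before the start or past the end are replaced by the first/last frame.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out: List[Frame] = []
    for t in range(n):
        row: Frame = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][d]   # clamp at last frame
                behind = frames[max(t - k, 0)][d]      # clamp at first frame
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Append first- and second-order deltas to each static feature vector."""
    first = delta(frames, window)
    second = delta(first, window)
    return [s + d1 + d2 for s, d1, d2 in zip(frames, first, second)]
```

Under these assumptions, each static vector of dimension D becomes a vector of dimension 3D (static, delta, and delta-delta), so a hypothetical 13-coefficient MFCC frame would grow to 39 values. Which combination of parameters the thesis found best is not stated in the abstract.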
Keywords/Search Tags:automatic speech recognition (ASR), recognition rate, man-machine interface, multimedia speech database, Mel Frequency Cepstrum Coefficient (MFCC)