Research On Linguistic Model Of Uyghur Continuous Speech Recognition System

Posted on:2010-09-29

Degree:Master

Type:Thesis

Country:China

Candidate:J Chen

GTID:2178360275997993

Subject:Computer application technology

Abstract/Summary:

The chief aim of this paper is to investigate the application of linguistic modeling techniques in a Uyghur continuous speech recognition system. Research on the acoustic model in speech recognition is relatively mature, and the acoustic model has limited capacity for processing the speech signal. Much non-acoustic information, such as syntax, semantics, and context, has not yet been put to good use in speech recognition. There is therefore little room left for advancing acoustic modeling, whereas the linguistic model still has considerable room for improvement. For this reason, this paper takes linguistic modeling as its main research direction and discusses in detail the application of the linguistic model in a continuous speech recognition system.

Firstly, taking the characteristics of Uyghur into account, it proposes a new principle for collecting a corpus based on the Confusable Set. From the original recognition results, all phonemes involved in insertion, deletion, and substitution errors are counted, and suitable sentences are then selected to extend the corpus.

Secondly, it uses CMU_Cam_Toolkit to process the new corpus, selects 5,500 high-frequency words (those occurring more than 10 times) to construct a dictionary, and trains a trigram statistical linguistic model.

Thirdly, it compares four smoothing methods, observes the effect of each on perplexity, and ultimately selects the good_turing method for data smoothing to optimize the model.

Finally, it uses the new trigram linguistic model in place of the original bigram model in the Uyghur continuous speech recognition system developed by the Key Laboratory of Multilingual Information Technology of Xinjiang University in 2008.

The experimental results indicate that the new linguistic model improves the performance of the Uyghur continuous speech recognition system. The sentence recognition rate increased from 68.98% to 73.29%, and the word recognition rate increased from 94.65% to 96.27%.
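The error-counting step behind the Confusable Set can be illustrated with a small sketch. The abstract does not say how reference and recognized phoneme sequences are aligned, so the Levenshtein alignment below is an assumption, and the function name `tally_phoneme_errors` and the toy phoneme symbols are hypothetical. It only shows one way to count which phonemes appear in insertion, deletion, and substitution errors.

```python
from collections import Counter


def tally_phoneme_errors(ref, hyp):
    """Align a reference and a recognized phoneme sequence by edit distance
    and count the phonemes involved in each error type.

    Assumption: unit-cost Levenshtein alignment; the thesis may align differently.
    """
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion (phoneme missing in hyp)
                dist[i][j - 1] + 1,         # insertion (extra phoneme in hyp)
            )

    errors = {"sub": Counter(), "del": Counter(), "ins": Counter()}
    i, j = n, m
    # Walk back through the table to recover one optimal alignment.
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            if ref[i - 1] != hyp[j - 1]:
                errors["sub"][ref[i - 1]] += 1  # count the misrecognized reference phoneme
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            errors["del"][ref[i - 1]] += 1
            i -= 1
        else:
            errors["ins"][hyp[j - 1]] += 1
            j -= 1
    return errors


# Toy example with made-up phoneme symbols.
result = tally_phoneme_errors(["a", "b", "c"], ["a", "x", "c", "d"])
print(result["sub"], result["del"], result["ins"])
```

Summing such counts over a whole test set would give a ranking of error-prone phonemes, which could then guide the choice of additional corpus sentences rich in those phonemes.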

Keywords/Search Tags:

Uyghur, continuous speech recognition, Confusable Set, linguistic model, CMU_Cam_Toolkit

Related items

1	Research On Uyghur Continuous Speech Recognition System Based On HTK
2	Research On The Technologies Of HTK Based Uyghur Continuous Phoneme Recognition
3	A Research On Uyghur Continuous Speech Recognition Based On Julius
4	Research On End To End Uyghur Speech Recognition Technology
5	Research On Human Computer Interaction Based On Speech Keyword Spotting
6	The Research Of Uyghur Acoustic Model Based On Deep Neural Network
7	CRF Continuous Speech Recognition Research And SVM
8	Acoustic Modeling For Continuous Speech Recognition
9	Research And Development Of Continuous Speech Recognition Based On HTK And Microsoft Speech SDK
10	Uyghur Speech Emotion Features Analysis And Recognition