
Research On Mandarin And Uyghur Speech Synthesis In Xinjiang Rural Information Pushing System

Posted on: 2017-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Han
GTID: 2335330488969819
Subject: Agricultural information technology
Abstract/Summary:
The Xinjiang rural information push system must first convert text information into audio and then play it through loudspeakers and FM radio broadcasts in rural areas. The text information is pushed by SMS. The system designed in this research consists of four functional parts: an SMS receiving module, text normalization, corpus construction, and waveform concatenation synthesis.
First, a message receiving module was implemented based on the SIM900A wireless module from SIMCom. When a message is received, the system calls back, obtains the user's password by decoding the keypad input with an MT8870 DTMF decoder chip, and verifies it. On the terminal, the system manages the timely pushing of information. This part focuses on message reception over mobile SMS.
Second, text preprocessing is discussed. The language of each piece of text is identified by where its characters fall in Unicode. From an engineering standpoint, the text is normalized according to rules for Mandarin and Uyghur documents; the Mandarin and Uyghur text is then segmented with the forward maximum matching algorithm, using a Mandarin dictionary and a Uyghur syllable library respectively.
Third, this research collected about 530 thousand Mandarin vocabulary entries with pronunciations, more than 7 thousand Mandarin words with pronunciations, and about 6 thousand Uyghur syllables. To solve the boundary problem in the corpus audio files, that is, speech endpoint detection, it proposed a universal speech endpoint detection technique and marked the speech endpoints of all the speech files.
Finally, waveform concatenation synthesis is performed with audio files selected from the corpus, keyed by the Mandarin vocabulary items or Uyghur syllables of the SMS text.
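The text preprocessing step described above (language identification by Unicode position, followed by forward maximum matching) can be illustrated with a minimal Python sketch. The abstract does not give the code point ranges or data structures used in the thesis, so the ranges below (CJK Unified Ideographs for Mandarin, the basic Arabic block for Uyghur), the function names, and the toy lexicon are all assumptions made for illustration.

```python
def script_of(ch):
    """Classify one character by its Unicode code point (assumed ranges)."""
    cp = ord(ch)
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs: treated as Mandarin
        return "mandarin"
    if 0x0600 <= cp <= 0x06FF:   # Arabic block: treated as Uyghur (Arabic script)
        return "uyghur"
    return "other"               # digits, Latin letters, punctuation, spaces


def split_by_script(text):
    """Group consecutive characters of the same script into (script, run) pairs."""
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and runs[-1][0] == script:
            runs[-1] = (script, runs[-1][1] + ch)
        else:
            runs.append((script, ch))
    return runs


def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring found in `lexicon`;
    if nothing matches, emit a single character and move on.
    """
    if max_len is None:
        max_len = max((len(entry) for entry in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy lexicon, not the thesis dictionary.
toy_lexicon = {"新疆", "农村", "信息", "农村信息", "推送", "系统"}
tokens = forward_max_match("新疆农村信息推送系统", toy_lexicon)
```

The same `forward_max_match` function would apply to a Uyghur run if `lexicon` held syllables instead of words; each resulting token can then serve as the key for looking up an audio file in the corpus.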
During waveform concatenation synthesis, smoothing is applied to the audio waveform to prevent noise at the splice points. The prosody of the speech is then optimized using a prosody model and a duration control strategy.
Overall, the message receiving function has been completed, and this research implemented Mandarin and Uyghur text-to-speech along with other functions. Testing shows that the text-to-speech output meets the design requirements.
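As a rough illustration of the corpus trimming and splice smoothing described in the abstract, the sketch below trims silence from each unit and joins units with a linear crossfade. The abstract does not specify the endpoint detection or smoothing methods used in the thesis: the short-time energy threshold and the linear crossfade here are common stand-in techniques, and the function names, frame size, and threshold are assumptions.

```python
def trim_silence(samples, frame=4, threshold=0.01):
    """Keep the span from the first to the last frame whose mean energy
    reaches `threshold` (a simple stand-in for endpoint detection)."""
    def energy(start):
        chunk = samples[start:start + frame]
        return sum(x * x for x in chunk) / len(chunk)

    voiced = [s for s in range(0, len(samples), frame) if energy(s) >= threshold]
    if not voiced:
        return []
    return samples[voiced[0]:voiced[-1] + frame]


def crossfade_concat(units, overlap):
    """Join sample lists, blending up to `overlap` samples at each splice
    with a linear ramp so the waveform has no abrupt jump."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        n = min(overlap, len(out), len(unit))
        base = len(out) - n                  # first overlapped index in `out`
        for k in range(n):
            w = (k + 1) / (n + 1)            # weight of the incoming unit
            out[base + k] = (1.0 - w) * out[base + k] + w * unit[k]
        out.extend(unit[n:])
    return out


# Two toy "units" with silent padding, standing in for corpus audio files.
unit_a = trim_silence([0.0] * 4 + [1.0] * 4 + [0.0] * 4)
unit_b = trim_silence([0.0] * 4 + [0.5] * 4)
joined = crossfade_concat([unit_a, unit_b], overlap=2)
```

With `overlap=0` the function reduces to plain concatenation, which makes it easy to compare spliced output with and without smoothing.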
Keywords/Search Tags: information pushing system, SMS receiving module, speech synthesis, text normalization, speech endpoint detection, waveform concatenation