
On The Normalization And Romanization Of Dai Language Texts For Textual Translation

Posted on: 2016-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: X X Hu
Full Text: PDF
GTID: 2208330470454092
Subject: Communication and Information System
Abstract/Summary:
The Dai are a cross-border ethnic group and one of China's important ethnic groups, with a long history. The Dai language is an independent language belonging to the Zhuang-Dai branch of the Sino-Tibetan language family; like Chinese, it is a monosyllabic tone language. This thesis aims to support the development of a Dai text-to-speech system. We study the Dai text analysis module, focusing on the normalization and Romanization of Dai words. The main work is as follows:

1. We analyze the Xishuangbanna Dai phonetic system, use the Teleport Ultra software to extract text from a raw corpus of Dai news, and build a text corpus and dictionary based on the literature [25].
2. We study Dai text normalization. We analyze and classify the non-standard words found in Dai text, such as Arabic numerals, English letters, and proper-noun abbreviations, and propose a rule-based method of combining words to normalize Dai text in the special-character processing module.
3. We analyze the Xishuangbanna Dai phonetic system in detail and formulate Romanization rules for converting Dai words to Roman characters. We build tone auto-tagging rules for Dai syllables according to the pronunciation rules of the Dai language. We propose a Romanized transliteration scheme for Dai text and design its algorithm flow in detail.

The Dai text normalization and Romanization methods were studied for the purpose of developing a Dai speech synthesis system. For text normalization, the proposed method handles common non-standard words fairly well, while its accuracy on complicated and ambiguous non-standard words remains to be improved. On the whole, the proposed normalization scheme basically meets the needs of developing a Dai speech synthesis system. For text Romanization, the transcription algorithm fully meets the system requirements and can be applied to language engineering.
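To make the normalization step more concrete, the following Python sketch shows one way a rule-based pass over non-standard words might be organized. It is not the thesis's implementation: the category labels, the regular expressions, and the `normalize` interface are all assumptions made for illustration. Only the three classes of non-standard words (Arabic numerals, English letters, abbreviations) come from the abstract. The spoken forms are supplied by the caller, because the abstract does not give the Dai readings.

```python
import re
from typing import Callable, Dict, List

# Hypothetical category labels; the abstract names the classes of
# non-standard words but not these identifiers.
NUMERAL = "numeral"
LATIN = "latin_letters"
ABBREV = "abbreviation"
PLAIN = "plain"

# Assumed tokenization: ASCII digit runs (optionally with a decimal part),
# ASCII letter runs, and runs of any other non-space characters.
TOKEN_RE = re.compile(r"[0-9]+(?:\.[0-9]+)?|[A-Za-z]+|[^0-9A-Za-z\s]+")
NUMERAL_RE = re.compile(r"[0-9]+(?:\.[0-9]+)?")
LATIN_RE = re.compile(r"[A-Za-z]+")


def classify(token: str, abbreviations: Dict[str, str]) -> str:
    """Assign a token to one of the assumed non-standard-word classes."""
    # The abbreviation lookup runs first so that a listed abbreviation
    # is not spelled out letter by letter.
    if token in abbreviations:
        return ABBREV
    if NUMERAL_RE.fullmatch(token):
        return NUMERAL
    if LATIN_RE.fullmatch(token):
        return LATIN
    return PLAIN


def normalize(
    text: str,
    abbreviations: Dict[str, str],
    read_number: Callable[[str], str],
    read_letter: Callable[[str], str],
) -> str:
    """Replace each non-standard word with a caller-supplied spoken form."""
    out: List[str] = []
    for token in TOKEN_RE.findall(text):
        kind = classify(token, abbreviations)
        if kind == ABBREV:
            out.append(abbreviations[token])
        elif kind == NUMERAL:
            out.append(read_number(token))
        elif kind == LATIN:
            out.append(" ".join(read_letter(ch) for ch in token))
        else:
            out.append(token)  # ordinary text passes through unchanged
    return " ".join(out)
```

In this sketch the ambiguity the abstract mentions (complicated or ambiguous non-standard words) is not handled; a token is classified only by its surface form, which is one reason such cases would need additional rules or context.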
Keywords/Search Tags: Dai, Text to Speech, Text Analysis, Text Normalization, Romanization