
Research And Implementation On Uyghur To Chinese Personal Name Transliteration Based On Syllabification

Posted on: 2015-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: L M M L T A
Full Text: PDF
GTID: 2285330422475551
Subject: Chinese Ethnic Language and Literature
Abstract/Summary:
Uyghur-to-Chinese automatic character transliteration (UtoC) is an important issue in minority-language information processing, and it also has a significant positive impact on machine translation and information retrieval. In recent years, the lack of a unified standard for rendering Xinjiang minority personal names in Chinese characters has caused many inconveniences: the same person may have different Chinese names on the residence booklet, passport, ID card, and remittance bill. To address this problem, this thesis makes a comprehensive analysis of the grapheme-based DOM transliteration framework and of Uyghur syllabification, and on this basis studies Uyghur-to-Chinese personal name transliteration. The main content of the thesis is as follows:

1. We first discuss the practicability of Uyghur-to-Chinese personal name transliteration under the grapheme-based DOM transliteration framework. This framework matches a source-language word directly to a target-language word, and the UtoC process matches the Uyghur letters and syllables of a personal name directly to the corresponding Chinese characters. The mapping of Uyghur letters and syllables to Chinese characters can therefore be implemented in a way that takes full advantage of this framework.

2. The thesis summarizes the principles of Uyghur syllabification and, building on the basic theory and key techniques of Uyghur syllabification, implements a Uyghur syllable statistics system. To establish the range of syllables that occur in Uyghur personal names, we computed statistics over 5000 Uyghur personal names and present the 20 most common syllables in Uyghur personal names.

3. Based on the grapheme-based transliteration framework, we designed the basic approach to UtoC and, after analyzing the structure of the Uyghur-to-Chinese phonetic table, put forward a fast and effective matrix-based mapping method. We randomly selected 5000 Uyghur personal names to test the system, which achieved an accuracy of only 52%.

4. To improve this low accuracy, the thesis surveyed a large number of Uyghur personal names and found 106 affixes from which Uyghur personal names are constructed. On this basis we built additional rules, based on name affixes, that can distinguish the sex of a name. Using these rules, we tested the same personal names a second time, and the overall accuracy increased by 30%, which shows that our approach is feasible and effective.
Keywords/Search Tags: Modern Uyghur Language, Syllable Segmentation, Uyghur to Chinese Personal Name Transliteration, Automatic Translation