
Study of Chinese-English Name Automatic Translation Method

Posted on: 2013-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: X X Li
Full Text: PDF
GTID: 2248330374954750
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology and the deepening of globalization, Chinese Web texts contain a large number of Chinese names as well as foreign names from many other nations. How to identify the source language of a personal name in Chinese text from the structural features of the name, and then transliterate it into English, is a focus of applied research in machine translation and cross-language information retrieval.

Based on the character regularities and structural features of names from different source languages in Chinese texts, this thesis studies methods for recognizing the origin of transliterated names, covering names from Europe and the United States, Japan, Korea and China, and then explores the automatic translation of these names into English. The study not only improves the methods and theory of name origin recognition in Chinese texts and of Chinese-English personal name transliteration, but also has broad application in machine transliteration, question answering and cross-language information retrieval. Specifically, the research covers the following three aspects:

First, taking the character as the basic processing unit, the thesis applies statistical methods to the training corpus to study the structural features and character regularities of personal names from different sources: Europe and the United States, Japan, Korea and China.

Second, treating Chinese name origin recognition as a classification problem, the thesis analyzes the characteristics of personal names from different sources, taking into account the length of the name, positional information within the name, and n-gram features. It then implements a system based on the maximum entropy (ME) model, the support vector machine (SVM) model and the Naive Bayes model that combines these features to identify the origin of a name. The work focuses on how different classification models and different features affect name origin recognition. The experimental results show that the SVM model has a clear advantage in recognizing names in Chinese texts from the four origins: China, Japan, Korea and the United States.

Finally, building on Chinese name origin recognition, the thesis uses a rule-based algorithm to align Chinese-English transliteration units. Under the grapheme-based transliteration framework, Chinese names from different sources are transliterated into English with Hidden Markov Models (HMMs). The experiments show that personal names carrying an origin label are transliterated better than names without any origin information.
Keywords/Search Tags: name origin recognition, transliteration unit alignment, Chinese-English name transliteration, SVM, HMMs
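
The second step of the abstract (name length, positional information and n-gram features fed to a classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature templates, the function `name_features`, the class `NaiveBayesOrigin`, and the twelve toy training names are hypothetical and are not taken from the thesis or its corpus. Only a small Naive Bayes classifier with add-one smoothing is shown; the ME and SVM models named in the abstract are not reproduced.

```python
from collections import Counter, defaultdict
import math


def name_features(name, max_n=2):
    """Character-level features: name length, first/last character, n-grams."""
    feats = [f"len={len(name)}", f"first={name[0]}", f"last={name[-1]}"]
    for n in range(1, max_n + 1):
        for i in range(len(name) - n + 1):
            feats.append(f"{n}gram={name[i:i + n]}")
    return feats


class NaiveBayesOrigin:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, names, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for name, label in zip(names, labels):
            for feat in name_features(name):
                self.feat_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, name):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            # The extra +1 reserves probability mass for unseen features.
            denom = sum(self.feat_counts[label].values()) + len(self.vocab) + 1
            for feat in name_features(name):
                score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch (not the thesis corpus).
train = [
    ("史密斯", "western"), ("约翰逊", "western"), ("威廉姆斯", "western"),
    ("田中太郎", "japanese"), ("山本一郎", "japanese"), ("佐藤健", "japanese"),
    ("王伟", "chinese"), ("李娜", "chinese"), ("张伟", "chinese"),
    ("朴智星", "korean"), ("金秀贤", "korean"), ("李敏镐", "korean"),
]
names, labels = zip(*train)
model = NaiveBayesOrigin().fit(names, labels)

print(model.predict("田中一郎"))
print(model.predict("王芳"))
```

With so little data the predictions are driven almost entirely by name length and shared characters, which is why a real system would need a large labeled corpus and, according to the abstract's results, would likely favor an SVM over this model.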