The Research On English-Chinese Name Entity Translation

Posted on: 2012-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: M M Zhao
GTID: 2218330368991827
Subject: Computer application technology
Abstract/Summary:
Named Entity (NE) translation is an important sub-task in multilingual language processing applications such as Machine Translation and Cross-lingual Information Extraction. In a Statistical Machine Translation system in particular, NE translation is an important factor in system performance. Different types of NE have different translation characteristics: Person Names and Location Names are translated mainly by transliteration, while Organization Names are translated by a combination of translation and transliteration. This thesis concentrates on English-Chinese Person Name transliteration modeling methods and Web-based Named Entity translation. The contributions of this work are summarized as follows:

Statistical Machine Translation-based and Machine Learning-based English-Chinese Person Name transliteration modeling methods. The Statistical Machine Translation-based transliteration model transforms the name transliteration problem into a general sentence translation problem. Two machine translation approaches, the phrase-based model and the N-gram model, are applied to transliteration modeling. In the Machine Learning-based transliteration model, the transliteration problem is transformed into a sequence-labeling problem. We test two Machine Learning methods: the Maximum Entropy model and Conditional Random Fields. We compare the performance of five modeling methods. The Machine Learning-based models give better performance, and Conditional Random Fields achieve the best accuracy.

Transliteration and Web-based Named Entity translation method. We propose a Person Name translation mining method that uses transliteration model results as heuristic query expansion to improve the quality of the snippets. High-quality snippets raise the inclusion rate of the Named Entity translation. We compare the model-based transliteration method with the transliteration and Web-based method. Experiments show that the second method performs better: it fixes incorrect Chinese characters in the transliteration results of the model-based method.

Web-based Organization Name translation mining method. We propose a method that extracts the Chinese translation of an English Organization Name (ON) from bilingual webpages. The words of the Organization Names are aligned by an alignment-anchor expansion-based method, and a greedy algorithm then extracts phrase and word translation pairs from the aligned ON pairs to build a bilingual dictionary. ON translations are extracted from webpages using this bilingual phrase and word dictionary. We compare the machine translation-based method with the Web-based method. Experiments show that the second method performs better.
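
The sequence-labeling view of transliteration mentioned in the first contribution can be illustrated with a small sketch. It is not the thesis's actual tagging scheme, which the abstract does not specify. It assumes a common B/I encoding in which each English letter is tagged with the Chinese character its chunk maps to, and the function names and the "smith" alignment are hypothetical. A labeler such as Maximum Entropy or Conditional Random Fields would then be trained to predict these tags from the letters.

```python
from typing import List, Tuple


def to_labeled_sequence(aligned: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn (English chunk, Chinese character) pairs into per-letter tags.

    The first letter of a chunk gets "B-<char>", the remaining letters
    get "I-<char>". This B/I scheme is an assumption for illustration.
    """
    labeled = []
    for en_chunk, zh_char in aligned:
        for i, letter in enumerate(en_chunk):
            prefix = "B" if i == 0 else "I"
            labeled.append((letter, f"{prefix}-{zh_char}"))
    return labeled


def decode(labeled: List[Tuple[str, str]]) -> str:
    """Recover the Chinese transliteration from a tagged letter sequence."""
    # Each "B-" tag starts a new chunk and carries its Chinese character.
    return "".join(tag.split("-", 1)[1] for _, tag in labeled if tag.startswith("B-"))


# Hypothetical alignment for the name "smith".
example = to_labeled_sequence([("s", "史"), ("mi", "密"), ("th", "斯")])
```

Once the problem is in this form, predicting a transliteration reduces to tagging each letter, and `decode` reads the Chinese name off the predicted tags.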
Keywords/Search Tags:Machine Transliteration Model, Machine Learning, Web Mining, Statistical Machine Translation, Word Alignment