Bilingual Named Entity Recognition Based Word Alignment And Machine Translation Research

Posted on: 2010-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhao
Full Text: PDF
GTID: 2178360275494200
Subject: Computer applications
Abstract/Summary:
Named Entities (NEs) are defined as proper names and quantities of interest: person, organization, and location names, as well as dates, times, percentages, and monetary amounts. A bilingual Named Entity is a pair of Named Entities that mean the same thing in two different languages. Recognizing such pairs plays an important role in many Natural Language Processing applications, such as cross-language retrieval, word alignment, and Machine Translation. This thesis studies word alignment and Machine Translation based on bilingual Named Entity Recognition. The main work and innovations are as follows.

For bilingual Named Entity Recognition, unlike conventional recognition methods, we propose an iterative algorithm. First, bilingual Named Entities are extracted from parallel corpora using bidirectional word alignment information. The reliable alignment information between these bilingual Named Entities is then fed back into the word alignment process to improve the alignment results. The improved alignments are in turn used to extract more bilingual Named Entities. This procedure is repeated until the number of bilingual Named Entities no longer increases, so the set of bilingual Named Entities grows with each iteration.

For word alignment, we propose replacing bilingual Named Entities with their types: the entity types are added to the alignment dictionary, and the bilingual Named Entities in the text are likewise replaced by their types. Our results show that both this method and the alternative of adding the bilingual Named Entities themselves to the dictionary improve word alignment, and that the type-replacement method reduces the alignment error rate (AER) more.

For Machine Translation, bilingual Named Entity Recognition is incorporated in two ways. In the first, it is used only in training the phrase translation model. In the second, it is embedded into the entire machine translation process, yielding a new entity-type-based machine translation method. Our experiments are based on CARAVAN, a phrase-based machine translation system developed by Xiamen University. Compared with the ordinary phrase translation model, the two methods improve the BLEU score by 5.05% and 17.27% in relative terms, respectively.
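The iterative recognition procedure described in the abstract can be sketched roughly as follows. This is a minimal Python illustration of the control flow only, not the thesis's implementation: the function name `iterate_ne_extraction`, the `align` and `extract` callables, the `max_rounds` safety cap, and the toy data are all assumptions introduced here. A real system would plug in an actual bidirectional word aligner and an actual entity-pair extractor.

```python
from typing import Any, Callable, Set, Tuple

# A bilingual NE pair: (source-language entity, target-language entity).
NEPair = Tuple[str, str]
# The alignment representation is left opaque in this sketch.
Alignment = Any


def iterate_ne_extraction(
    align: Callable[[Set[NEPair]], Alignment],
    extract: Callable[[Alignment], Set[NEPair]],
    max_rounds: int = 10,  # hypothetical safety cap, not from the abstract
) -> Set[NEPair]:
    """Alternate word alignment and NE-pair extraction until the number
    of bilingual NE pairs stops increasing."""
    known: Set[NEPair] = set()
    for _ in range(max_rounds):
        # Word alignment, guided by the NE pairs found so far.
        alignment = align(known)
        # Extract NE pairs from the (hopefully improved) alignment.
        updated = known | extract(alignment)
        # Stopping rule: the pair count no longer grows.
        if len(updated) == len(known):
            break
        known = updated
    return known


# --- Toy stand-ins (purely illustrative, no real aligner involved) ---
TOY_PAIRS = [
    ("src_ne_1", "tgt_ne_1"),
    ("src_ne_2", "tgt_ne_2"),
    ("src_ne_3", "tgt_ne_3"),
]


def toy_align(known: Set[NEPair]) -> int:
    # Pretend each known pair makes one more pair recoverable.
    return min(len(known) + 1, len(TOY_PAIRS))


def toy_extract(alignment: int) -> Set[NEPair]:
    # Pretend the alignment "level" exposes the first N toy pairs.
    return set(TOY_PAIRS[:alignment])


result = iterate_ne_extraction(toy_align, toy_extract)
print(sorted(result))
```

In the toy run, each round exposes one more pair until all three are found, and the following round adds nothing, which triggers the stopping rule. Passing the aligner and extractor as callables keeps the loop independent of any particular alignment tool.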
Keywords/Search Tags: Recognition