
Research On Named Entity Word Alignment Between Chinese And English

Posted on: 2007-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Tang
Full Text: PDF
GTID: 2178360185986102
Subject: Computer Science and Technology
Abstract/Summary:
With the development of computers and the internet, bilingual named entity word alignment is finding more and more applications in natural language processing. Beyond machine translation, it is also useful in term extraction, information retrieval, translation dictionary compilation, and natural language generation. According to statistics, most nouns that cannot be found in available dictionary resources are named entities, including some technical terms, because this part of the vocabulary changes quickly. Noun alignment, and bilingual named entity alignment in particular, is therefore all the more important.

Traditional word alignment approaches do not produce satisfactory results for named entity alignment. In this thesis, we propose several named entity alignment algorithms built on existing word alignment algorithms. From the alignment results, translation segment pairs are extracted to build a translation example base; we then apply it in an example-based machine translation (EBMT) system and evaluate the translation output.

The thesis is organized as follows.

We build a self-learning system that acquires detection rules for named entity alignment using transformation-based error-driven learning, a technique that has been applied successfully to parsing. After a large corpus of 3529 manually word-aligned bilingual sentences was built, the self-learning system acquired a large number of rules from it. These rules are easy to acquire and apply, and using them raises the accuracy of automatic named entity alignment detection.

When the corpus is insufficient, we introduce phrase Web frequency as an evaluation criterion. We construct a query from the named entities that remain unaligned in a sentence pair, submit it to a search engine automatically, and process the relevant search results to obtain reliable alignments.

Experiments show that the new method improves the recall of bilingual sentence alignment. The strengths and weaknesses of the two methods are, to some extent, strongly complementary. As we can see,...
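The Web-frequency step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the query format, the scoring function, or the search engine used, so everything here is an assumption: the quoted two-phrase query, the greedy one-to-one pairing, the `min_hits` threshold, and the names `build_query`, `align_by_web_frequency`, and `fake_hit_count` are all hypothetical. The hit counts are invented toy values standing in for real search results.

```python
from typing import Callable, List, Tuple


def build_query(source_ne: str, target_ne: str) -> str:
    """Build a query asking for pages that contain both phrases (assumed format)."""
    return f'"{source_ne}" "{target_ne}"'


def align_by_web_frequency(
    unaligned_source: List[str],
    unaligned_target: List[str],
    hit_count: Callable[[str], int],
    min_hits: int = 1,
) -> List[Tuple[str, str, int]]:
    """Greedily pair unaligned named entities by co-occurrence hit count.

    `hit_count` is supplied by the caller; in a real system it would wrap
    a search engine, which this sketch deliberately does not do.
    """
    # Score every candidate pair among the entities left unaligned.
    scored = [
        (hit_count(build_query(s, t)), s, t)
        for s in unaligned_source
        for t in unaligned_target
    ]
    scored.sort(key=lambda item: item[0], reverse=True)

    used_source, used_target = set(), set()
    alignments = []
    for hits, s, t in scored:
        if hits < min_hits:
            break  # remaining pairs have too little Web evidence
        if s in used_source or t in used_target:
            continue  # keep the alignment one-to-one
        used_source.add(s)
        used_target.add(t)
        alignments.append((s, t, hits))
    return alignments


# Toy hit counts, invented for illustration only.
FAKE_COUNTS = {
    '"北京" "Beijing"': 900,
    '"北京" "Microsoft"': 3,
    '"微软" "Beijing"': 5,
    '"微软" "Microsoft"': 700,
}


def fake_hit_count(query: str) -> int:
    """Stand-in for a search engine: look the query up in the toy table."""
    return FAKE_COUNTS.get(query, 0)


pairs = align_by_web_frequency(
    ["北京", "微软"], ["Beijing", "Microsoft"], fake_hit_count
)
print(pairs)
```

Passing the hit counter in as a function keeps the pairing logic separate from any particular search engine, so the same code can be run against cached counts or a test table.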
Keywords/Search Tags: named entity alignment, transformation-based error-driven learning, example-based machine translation