Example-Based English-Chinese Translation System

Posted on: 2009-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhou
Full Text: PDF
GTID: 2208360245461045
Subject: Software engineering
Abstract/Summary:
As an important branch of natural language processing (NLP), machine translation research has undergone a long and tortuous development and has produced a variety of methods. As computer processing speed and storage capacity have grown, Example-Based Machine Translation (EBMT) has gained favor with more and more researchers. EBMT translates by reusing the translation information in existing examples and substituting the parts that differ. It thereby avoids complicated deep linguistic and semantic analysis and overcomes the knowledge-acquisition difficulties of Rule-Based Machine Translation (RBMT). Moreover, it can produce high-quality translations when the corpus contains sentences similar to the input sentence.

The simple declarative sentence is the foundation of the other sentence forms in English, so simple sentences are the focus of EBMT research. This thesis therefore studies only simple English sentences and does not consider state and other information.

First, this thesis presents the history and current status of machine translation. Second, it introduces the basic principles of EBMT, the components of an EBMT system, and several related research issues: building the bilingual corpus, matching the most similar examples, and recombining the target sentence.

The main content of this thesis is divided into four parts: creating the bilingual corpus; storing and indexing the bilingual corpus; calculating sentence similarity; and matching sentence fragments to produce the translation of the input sentence.

On the basis of this research, I attempt to implement an English-Chinese translation system for simple sentences. The main process for an input sentence or sentence fragment is as follows. First, keywords are extracted from the input. Second, the corpus is searched by these keywords to retrieve the matching examples and their translations. Third, the similarity between each retrieved sentence and the input sentence is calculated. Last, the most similar sentence is selected, replacement and insertion are applied to it, and the result is recombined into the translated sentence.

Finally, the implementation of the translation system, the experimental results, and their evaluation are presented. The experimental results show that the system can produce high-quality translations when the corpus contains sentences similar to the input sentence.
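The four-step process described in the abstract (keyword extraction, keyword-based retrieval, similarity calculation, replacement and recombination) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis's actual implementation: the toy `CORPUS`, the `LEXICON`, the `STOPWORDS` list, and all function names are hypothetical. The abstract does not say which similarity measure was used, so the Dice coefficient over word sets stands in for it here. The sketch also handles only one-for-one word replacement in sentences of equal length and omits insertion.

```python
import re
from collections import defaultdict

# Hypothetical toy bilingual corpus: (English example, Chinese translation).
CORPUS = [
    ("i like apples", "我喜欢苹果"),
    ("she reads books", "她读书"),
    ("he likes music", "他喜欢音乐"),
]

# Hypothetical word-level dictionary used for the replacement step.
LEXICON = {"apples": "苹果", "oranges": "橘子", "books": "书", "music": "音乐"}

# Hypothetical stopword list; words outside it count as keywords.
STOPWORDS = {"a", "an", "the", "i", "he", "she", "is", "are"}


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def keywords(tokens):
    """Step 1: keep the tokens that are not stopwords."""
    return [t for t in tokens if t not in STOPWORDS]


def build_index(corpus):
    """Inverted index: keyword -> set of corpus positions containing it."""
    index = defaultdict(set)
    for i, (src, _) in enumerate(corpus):
        for kw in keywords(tokenize(src)):
            index[kw].add(i)
    return index


def similarity(a, b):
    """Step 3: Dice coefficient over word sets (an assumed measure)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def translate(sentence, corpus, index, lexicon):
    """Return a translation, or None if no usable example is found."""
    tokens = tokenize(sentence)
    # Step 2: retrieve candidate examples that share a keyword.
    candidates = set()
    for kw in keywords(tokens):
        candidates |= index.get(kw, set())
    if not candidates:
        return None
    # Step 3: pick the most similar candidate (lowest index wins ties).
    best = max(sorted(candidates),
               key=lambda i: similarity(tokens, tokenize(corpus[i][0])))
    src, tgt = corpus[best]
    src_tokens = tokenize(src)
    if len(src_tokens) != len(tokens):
        return None  # this sketch does not handle insertion or deletion
    # Step 4: replace each differing word's translation in the target.
    for old, new in zip(src_tokens, tokens):
        if old != new:
            if old not in lexicon or new not in lexicon:
                return None
            if lexicon[old] not in tgt:
                return None
            tgt = tgt.replace(lexicon[old], lexicon[new], 1)
    return tgt


index = build_index(CORPUS)
print(translate("I like oranges", CORPUS, index, LEXICON))
```

In this toy setting the input "I like oranges" retrieves the example "i like apples" through the shared keyword "like", and the replacement step swaps the translation of "apples" for that of "oranges". A real system would need word alignment between source and target rather than a flat lexicon lookup.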
Keywords/Search Tags: machine translation, EBMT, bilingual aligned trees, sentence similarity