The Research And Application Of Phrase-Based Statistical Machine Translation System

Posted on: 2008-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: H X Miao
Full Text: PDF
GTID: 2178360212483677
Subject: Computer application technology
Abstract/Summary:
Machine translation (MT) is regarded as a central and difficult problem in natural language processing, and it has both theoretical and practical significance for international communication and cooperation. After reviewing the state of machine translation research in China and abroad, this thesis analyzes the technology behind statistical machine translation (SMT) and implements a machine translation system, which is applied to the translation of titles in the aviation domain. The results show that the system performs well on this task.

The contribution of this thesis lies in combining several open tools to build a phrase-based statistical machine translation system. The whole process covers preprocessing of the corpus, training of the model parameters, execution of the translation itself, and automatic evaluation of the translation results, so that a complete translation procedure is implemented.

The work in this thesis mainly includes the following.

First, preprocessing of the corpus. Because statistical machine translation is usually based on a bilingual corpus, the way the corpus is processed directly influences the quality of the translation. This thesis preprocesses the Chinese and English corpora separately.

Second, building on a study of SMT theory, the work makes use of currently available resources and tools. We implement an SMT system using the phrase-based model, and describe how the system runs and how its parameters are set.

Third, we apply the system to translation in the aviation domain, taking into account the characteristics of titles in that domain, and obtain better translation results than comparable systems.

Fourth, we study automatic evaluation technology for machine translation, which is receiving more and more attention in natural language processing. On the basis of this study, the work evaluates the translation results for titles in the aviation domain.

In addition, the research and experiments in this thesis confirm the validity of the phrase-based SMT method. Researchers are currently exploring translation models based on the more complex lexical structure of language, and this has become a new research direction. We believe that such exploration can improve machine translation performance, and it is a key point of our future research.
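
As an illustration of the automatic evaluation step mentioned above, the following is a minimal sketch of a BLEU-style score for a single translated title. The abstract does not name the metric used in the thesis, so the choice of a BLEU-like measure is an assumption, as are the function names (ngrams, clipped_precision, bleu_like), the whitespace tokenization, and the example titles.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Modified n-gram precision: each candidate n-gram count is clipped
    by the count of the same n-gram in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def bleu_like(candidate, reference, max_n=2):
    """Geometric mean of clipped precisions (orders 1..max_n) multiplied
    by a brevity penalty. Assumption: one reference, no smoothing."""
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    if len(candidate) >= len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * math.exp(log_mean)


if __name__ == "__main__":
    # Hypothetical aviation-style title, tokenized on whitespace.
    reference = "aircraft engine maintenance manual".split()
    candidate = "engine maintenance manual".split()
    print(bleu_like(candidate, reference))
```

max_n defaults to 2 here because titles are short, and higher-order n-grams would often yield a zero score without smoothing. A corpus-level evaluation would normally aggregate counts over all titles rather than averaging per-title scores.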
Keywords/Search Tags: Statistical machine translation, Phrase, Corpus, Statistical model, Auto evaluation