The Design And Realization Of A Phrase-based Statistical Chinese-English MTS

Posted on: 2010-02-15 | Degree: Master | Type: Thesis
Country: China | Candidate: X F He | Full Text: PDF
GTID: 2178360278973162 | Subject: Software engineering
Abstract/Summary:
This thesis surveys the current state of statistical machine translation and discusses the theory behind phrase-based methods, currently the dominant approach in the field. It first introduces Pharaoh, the first phrase-based statistical machine translation system, to give the reader a concrete picture of how such a system works.
It then presents the design of a phrase-based statistical machine translation system through data models and illustrations. It covers alignment of the training corpus, phrase extraction, decoding with the automatically extracted phrases, and word reordering models under different constraint rules, and it lays out the theoretical design needed for a concrete implementation.
After establishing the data model and dividing the system into modules, we integrated a phrase-based statistical machine translation system from resources currently available in China and abroad, including open source tools and publicly available licensed tools. The integration covers a Chinese word segmentation tool, a word alignment module, an English tokenization tool, language model tools, and the training corpora, together with the formats of the word-aligned corpus, the phrase translation probability table, the language model, the input and output files, and the reference translations. We made the modules interoperate and standardized the data flow across the whole system.
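The phrase extraction step mentioned above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes the commonly used alignment-consistency criterion (a phrase pair is kept only if no word inside it is aligned to a word outside it), and the function name `extract_phrase_pairs`, the `max_len` limit, and the example sentence pair are hypothetical. Unlike full extractors, it does not extend phrases over unaligned words.

```python
from typing import List, Set, Tuple


def extract_phrase_pairs(
    src: List[str],
    tgt: List[str],
    alignment: Set[Tuple[int, int]],
    max_len: int = 4,
) -> Set[Tuple[str, str]]:
    """Collect phrase pairs consistent with a word alignment.

    `alignment` holds (source_index, target_index) links.
    Simplification (assumption): target spans are the minimal span covering
    the linked words; unaligned boundary words are not added.
    """
    pairs: Set[Tuple[str, str]] = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Target positions linked to any word in the source span.
            linked = [t for (s, t) in alignment if s_start <= s <= s_end]
            if not linked:
                continue
            t_start, t_end = min(linked), max(linked)
            if t_end - t_start + 1 > max_len:
                continue
            # Consistency check: no target word inside the span may be
            # linked to a source word outside the source span.
            violates = any(
                t_start <= t <= t_end and not (s_start <= s <= s_end)
                for (s, t) in alignment
            )
            if violates:
                continue
            pairs.add(
                (" ".join(src[s_start:s_end + 1]),
                 " ".join(tgt[t_start:t_end + 1]))
            )
    return pairs


if __name__ == "__main__":
    # Hypothetical postal-domain example; the alignment is hand-written.
    src = ["我", "寄", "包裹"]
    tgt = ["I", "send", "a", "parcel"]
    alignment = {(0, 0), (1, 1), (2, 3)}
    for pair in sorted(extract_phrase_pairs(src, tgt, alignment)):
        print(pair)
```

In a full system, the extracted pairs would then be counted over the whole word-aligned corpus to estimate the phrase translation probability table.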
For evaluation, we held out part of the corpus as a development set and a test set, and used existing evaluation tools to measure the accuracy and BLEU score of the integrated phrase-based statistical machine translation system.
We also designed a phrase-based Chinese-English machine translation system specialized for postal-domain terminology, the Post translation system. Building on the ideas above and on existing open source resources, and supplemented by a translation memory, a dictionary, and other modules, we developed this system in a client-server architecture under the name "Post translation pass". It offers a convenient user interface, lets users dynamically add custom templates, dictionaries, and similar resources to guide and correct the translations produced by the back end, and can translate documents in batches. The phrase-based model achieves good performance on the translation task and deserves further research.
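For readers unfamiliar with the BLEU score used in the evaluation above, the following minimal Python sketch shows how the metric is computed. It is an illustration under stated assumptions, not the evaluation tool used in the thesis: it assumes a single reference per sentence, pre-tokenized input, n-grams up to length 4, and no smoothing. The function names `ngrams` and `corpus_bleu` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(
    hypotheses: List[List[str]],
    references: List[List[str]],
    max_n: int = 4,
) -> float:
    """Corpus-level BLEU with one reference per hypothesis (assumption)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipped counts: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in hyp_counts.items()
            )
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any zero precision makes the score zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for output shorter than the reference.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


if __name__ == "__main__":
    # Hypothetical example sentences.
    hyp = [["the", "parcel", "was", "sent", "to", "shanghai"]]
    ref = [["the", "parcel", "was", "sent", "to", "beijing"]]
    print(f"BLEU = {corpus_bleu(hyp, ref):.4f}")
```

Counts are accumulated over the whole test set before the precisions are computed, which is why a single sentence with no 4-gram match does not zero out the corpus score.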
Keywords/Search Tags: statistical machine translation, phrase, translation model