The Study On Phrase-Based Statistical Machine Translation System

Posted on: 2008-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: B F Zhang
Full Text: PDF
GTID: 2178360245491811
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Over the past ten years, statistical methods have become increasingly popular in machine translation research. New statistical methods continue to emerge, and the performance of statistical machine translation systems has improved greatly. In recent years, such systems have achieved good results in several machine translation evaluations, and the statistical approach has gradually become the mainstream of the field.

This thesis studies and discusses the major methods of statistical machine translation, including the source-channel model and the maximum entropy model, and describes in detail the design and implementation of a phrase-based translation system.

In statistical machine translation, the phrase-based translation model outperforms the word-based translation model. In all current phrase-based models, however, all possible phrase segmentations of a source sentence are assumed to follow a uniform probability distribution, and all of them are passed to the decoder, which searches for the best output. Unfortunately, uniformly distributed segmentations make no use of source-language linguistic knowledge and may lead the translation model to choose wrong target candidate phrases. This thesis proposes a novel phrase segmentation probability model, learned statistically from source-language linguistic knowledge, which guides the decoder toward proper phrase segmentations of the source sentence. The model is treated as an independent feature and can easily be added to a statistical machine translation system based on the maximum entropy model. Experiments on Chinese-English and French-English translation show that this method clearly improves the performance of the statistical machine translation system.

The alignment template model and the standard phrase-based model are two typical representatives of phrase-based translation. They use alignment templates and phrase pairs, respectively, in the translation process. This thesis briefly compares the two models. Each has its own advantage: alignment templates handle the data sparseness problem better, while phrase pairs make the translation more precise. This thesis proposes a method for simulating alignment templates within the standard phrase-based translation model based on maximum entropy. The simulated alignment template is treated as a feature and added to the translation model. Experiments on Chinese-English and French-English translation tasks show that this method clearly improves the performance of the standard phrase-based translation model.
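The idea of replacing the uniform segmentation distribution with a learned one, and adding it as one more feature of a maximum entropy (log-linear) model, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names, the `boundary_prob` table, the feature names, and the weights are hypothetical, and the simple per-gap boundary model is only a stand-in, not the segmentation model proposed in the thesis.

```python
import math


def segmentations(words, max_len=None):
    """Yield every split of `words` into contiguous phrases (tuples of words)."""
    if not words:
        yield []
        return
    for k in range(1, len(words) + 1):
        if max_len is not None and k > max_len:
            break
        head = tuple(words[:k])
        for rest in segmentations(words[k:], max_len):
            yield [head] + rest


def uniform_log_prob(words):
    """Log-probability of any one segmentation under the uniform assumption.

    A sentence of n >= 1 words has 2**(n - 1) segmentations when phrase
    length is unrestricted, so each one gets probability 1 / 2**(n - 1).
    """
    return -(len(words) - 1) * math.log(2)


def segmentation_log_prob(seg, boundary_prob, default=0.5):
    """Hypothetical stand-in model: each gap between adjacent words is a
    phrase boundary with probability boundary_prob[(left, right)].

    With every gap at 0.5 this reduces to the uniform distribution.
    Probabilities must lie strictly between 0 and 1.
    """
    words = [w for phrase in seg for w in phrase]
    cuts, pos = set(), 0
    for phrase in seg[:-1]:
        pos += len(phrase)
        cuts.add(pos)
    total = 0.0
    for i in range(1, len(words)):
        p = boundary_prob.get((words[i - 1], words[i]), default)
        total += math.log(p if i in cuts else 1.0 - p)
    return total


def log_linear_score(features, weights):
    """Weighted sum of feature values, as in a log-linear (maximum entropy) model."""
    return sum(weights[name] * value for name, value in features.items())


if __name__ == "__main__":
    words = ["the", "black", "cat"]
    # Toy value: a boundary between "black" and "cat" is assumed unlikely.
    boundary_prob = {("black", "cat"): 0.1}
    # Made-up weights; in practice they would be tuned on held-out data.
    weights = {"translation_model": 1.0, "segmentation": 0.3}
    for seg in segmentations(words):
        features = {
            "translation_model": 0.0,  # placeholder, no real model here
            "segmentation": segmentation_log_prob(seg, boundary_prob),
        }
        print(seg, log_linear_score(features, weights))
```

Because the segmentation score enters the model as just another weighted feature, it can be added without changing the other components, which is the property the abstract relies on.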
Keywords/Search Tags: Statistical Machine Translation, Phrase Segmentation, Alignment Template, Standard Phrase-Based Translation Model, Maximum Entropy