
A Study Of Complex Structure Alignment And Development Set Selection Strategy For Machine Translation

Posted on: 2013-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: C Hui
GTID: 2218330362959274
Subject: Computer software and theory
Abstract/Summary:
Machine translation is a core technique for cross-language cooperation and communication in the modern world. It plays an important role in culture, science, religion, and sociology. Learning how to translate from large-scale data is called statistical machine translation. One important part of statistical machine translation is alignment, which extracts linguistic structures, such as words, phrases, syntax, and semantics, from sentence pairs in two languages to guide translation. Another problem, inherited from machine learning, is domain adaptation, which affects the selection of the development set used to optimize model parameters: different development sets exert a large influence on translation quality. This thesis focuses on these two problems. For the first problem, it surveys the alignment modules used by single translation systems in previous statistical machine translation workshop shared tasks, carries out a comparative experimental study using the task data, and shows that phrase alignment is the mainstream approach in current statistical machine translation alignment systems. For the domain adaptation problem, it proposes two evaluation measures, the difference of best translation error and BLEU-RECALL, for selecting the development set, and experiments show that translation performance improves significantly.
Keywords/Search Tags: Statistical Machine Translation, Alignment, Phrase Syntax, Domain Adaptation, Development Set Selection