Research On Interactive Machine Translation Technique Based On Multiple Positive Constraints

Posted on: 2017-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Fu
GTID: 2348330482981570
Subject: Computer application technology
Abstract/Summary:
Machine Translation (MT) is the process of using a computer to translate text from one natural language (the source) into another (the target). After decades of research, MT technology has made considerable progress. However, state-of-the-art MT systems still cannot produce translations that meet the high quality requirements of many practical applications.

Under these circumstances, researchers proposed Interactive Machine Translation (IMT). In IMT, a translator accepts a prefix of the MT output, and the system uses that prefix to guide decoding and generate a new translation. After repeated interaction, the translator accepts the whole sentence and the IMT process ends. However, a prefix-based IMT system uses the prefix as its only constraint, and this strict requirement limits how much the human can guide decoding. In fact, a translator can also supply useful information in the suffix, beyond the prefix, to guide decoding, and this may produce better translations.

This thesis extends the interaction modes of prefix-based IMT. In addition to confirming and revising a prefix, the translator can also provide multiple correct fragments (CFs) as additional constraints to guide decoding. To accommodate the CFs, we make five improvements to the prefix-based IMT system. First, beam search is adopted instead of multi-stack decoding to increase decoding speed. Second, when translation hypotheses are extended according to the CFs and the prefix, new extension rules decide whether the new hypotheses satisfy these constraints. Third, the number of hypotheses that cover the same source-word positions is limited to increase hypothesis diversity. Fourth, the proportion of CFs in a hypothesis is added as a feature in the log-linear model. Fifth, word alignment information is exploited to filter the phrase table.
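The second and fourth improvements can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (`find_after`, `satisfies_constraints`, `cf_proportion`) are hypothetical, the check is applied to complete hypotheses only, and it assumes that CFs must appear after the prefix, in the order given, without overlapping. The abstract does not state these details.

```python
from typing import List, Sequence

Tokens = Sequence[str]


def find_after(tokens: Tokens, fragment: Tokens, start: int) -> int:
    """Return the index just past the first match of `fragment`
    at or after `start`, or -1 if there is no match."""
    m = len(fragment)
    for i in range(start, len(tokens) - m + 1):
        if list(tokens[i:i + m]) == list(fragment):
            return i + m
    return -1


def satisfies_constraints(hyp: Tokens, prefix: Tokens, cfs: List[Tokens]) -> bool:
    """Assumed rule: the hypothesis starts with the prefix and contains
    every CF after it, in order and without overlap."""
    if list(hyp[:len(prefix)]) != list(prefix):
        return False
    pos = len(prefix)
    for cf in cfs:
        pos = find_after(hyp, cf, pos)
        if pos < 0:
            return False
    return True


def cf_proportion(hyp: Tokens, prefix: Tokens, cfs: List[Tokens]) -> float:
    """Assumed feature: fraction of hypothesis tokens covered by CFs
    that can be matched, in order, after the prefix."""
    if not hyp:
        return 0.0
    covered = 0
    pos = min(len(prefix), len(hyp))
    for cf in cfs:
        end = find_after(hyp, cf, pos)
        if end < 0:
            continue  # this CF is missing; later CFs may still match
        covered += len(cf)
        pos = end
    return covered / len(hyp)


hyp = "the cat sat on the mat".split()
prefix = ["the"]
cfs = [["sat", "on"], ["mat"]]
print(satisfies_constraints(hyp, prefix, cfs))  # True
print(cf_proportion(hyp, prefix, cfs))          # 0.5
```

In a log-linear model, a value such as `cf_proportion` would be multiplied by a tuned weight and added to the hypothesis score alongside the other features. During decoding, the check would have to run on partial hypotheses, which this sketch does not attempt.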
Experimental results on a Chinese-English corpus show that our method makes effective use of the CFs. Compared with the IMT method that uses only the prefix, our system achieves a lower KSMR (key-stroke and mouse-action ratio) score.
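For readers unfamiliar with the metric: KSMR is commonly defined in the IMT literature as the number of keystrokes plus mouse actions the translator needs, divided by the number of characters in the reference translation, so lower values mean less human effort. The sketch below computes it at corpus level. The function name `ksmr` and the input layout are assumptions made for illustration, as is the choice to count spaces as reference characters. The abstract does not describe the thesis's exact counting conventions.

```python
from typing import Iterable, Tuple


def ksmr(sessions: Iterable[Tuple[int, int, str]]) -> float:
    """Corpus-level KSMR.

    Each session is (keystrokes, mouse_actions, reference_translation).
    Assumption: every character of the reference, spaces included,
    counts toward the denominator.
    """
    total_actions = 0
    total_chars = 0
    for keystrokes, mouse_actions, reference in sessions:
        total_actions += keystrokes + mouse_actions
        total_chars += len(reference)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    return total_actions / total_chars


# Made-up interaction counts, purely illustrative.
print(ksmr([(3, 2, "hello world"), (0, 1, "ok")]))
```

The numbers in the usage line are invented for the example and are not results from the thesis.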
Keywords/Search Tags: Interactive Machine Translation, Correct Fragments, Multiple Constraints, Decoding