Research on Translation Rule Constraint Problems in the Hierarchical Phrase-Based Translation Model

Posted on: 2016-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: H F Sun
Full Text: PDF
GTID: 2308330461456522
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, traditional human translation can no longer meet the huge demand for translation, which highlights the importance of machine translation. Among the various approaches to machine translation, statistical machine translation (SMT) has become popular because of its high scalability and good translation performance. The hierarchical phrase-based (HPB) translation model is one of the best models in SMT, and a great deal of research has been carried out on it. Compared with the phrase-based translation model, the HPB model handles non-local phrase reordering better. One of the most important properties of HPB is that a translation rule learned from one phrase pair can be applied to any other phrase pair with the same pattern. It is therefore very important to select the correct translation rule during the translation process.

This thesis focuses on the lack of restrictions on translation rule matching in the HPB model. We study this problem in depth and propose a new translation rule constraint model. The proposed model uses syntactic parse trees, phrase boundaries and rich context information to constrain translation rules efficiently, and thereby improves the translation quality of the baseline system. The model uses only source-side features, so the feature values can be computed in advance. As a result, it adds little time to the translation process and can easily be integrated into the log-linear model used in SMT. We conducted experiments on a large-scale Chinese-English translation task. The experimental results show that our model achieves a stable improvement over the baseline system.

In addition, to train the translation model and the rule constraint model quickly, we used the Hadoop distributed computing platform to process the translation data. The experimental results show that this method drastically reduces data processing time and improves work efficiency.
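To illustrate how a precomputed source-side feature might enter a log-linear model, the following is a minimal sketch and not the thesis's implementation. The feature names (`translation_logprob`, `lm_logprob`, `rule_constraint`), the weights, and the candidate values are all hypothetical and chosen only to show the mechanism: each candidate rule application is scored as a weighted sum of its feature values, and the extra constraint feature can change which rule is preferred.

```python
from typing import Dict


def log_linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values; a feature with no weight contributes 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def best_candidate(candidates: Dict[str, Dict[str, float]],
                   weights: Dict[str, float]) -> str:
    """Return the name of the candidate with the highest log-linear score."""
    return max(candidates, key=lambda name: log_linear_score(candidates[name], weights))


# Hypothetical feature weights (in practice these would be tuned on held-out data).
weights = {"translation_logprob": 1.0, "lm_logprob": 0.5, "rule_constraint": 2.0}

# Two hypothetical rule applications competing for the same source span.
# "rule_constraint" stands in for a source-side score computed before decoding.
candidates = {
    "rule_a": {"translation_logprob": -1.0, "lm_logprob": -2.0, "rule_constraint": -0.25},
    "rule_b": {"translation_logprob": -0.5, "lm_logprob": -2.0, "rule_constraint": -1.5},
}

# Baseline weights: the same model without the constraint feature.
baseline_weights = {k: v for k, v in weights.items() if k != "rule_constraint"}

if __name__ == "__main__":
    print("baseline picks:", best_candidate(candidates, baseline_weights))
    print("with constraint feature:", best_candidate(candidates, weights))
```

In this toy setup the baseline prefers `rule_b`, while adding the constraint feature shifts the choice to `rule_a`. Because the constraint value depends only on the source side, it could be looked up from a table built before decoding rather than computed during search.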
Keywords/Search Tags:Hierarchical Phrase-Based Translation Model, Rule Constraint, Parsing Tree, Phrase Boundary, Distributed Computing Platform