Tree-to-Tree Statistic Machine Translation Based On Discriminative Learning

Posted on: 2015-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: K Song
Full Text: PDF
GTID: 2348330473953635
Subject: Computer software and theory
Abstract/Summary:
We study a novel architecture for syntactic machine translation based on a discriminative machine learning model and dependency syntax. It differs substantially from the dominant approach in the literature and obtains BLEU and METEOR scores comparable to those of state-of-the-art machine translation systems, offering a new way to approach the machine translation task.

Our system does not rely on translation rules; instead, it treats translation as an unconstrained target sentence generation task, using soft features to capture lexical and syntactic correspondences between the source and target languages. Unlike a traditional generative model, the discriminative model used in our architecture can exploit a large set of features, which helps the model make decisions during decoding. Target syntax features and bilingual translation features are trained together in a single discriminative model.

Training proceeds as follows. First, we preprocess the bilingual corpus: certain particular words in the bilingual sentences are recognized, then generalized and translated. Next, we parse the bilingual corpus so that each sentence in a bilingual sentence pair has a dependency parse tree. Word alignment is applied after bilingual dependency parsing. From these results we extract translation rules from the bilingual parse trees, which contain dependency structure, and compute translation probabilities at the same time. The model then guides the decoder in building the target dependency tree using the translation rules, which completes the translation task.

Experiments on the IWSLT 2010 dataset show that the system achieves BLEU comparable to state-of-the-art syntactic SMT systems. On the GEO Query data, performance is comparable to that of a hierarchical system and better than that of both phrase-based and syntax-based systems.
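To make the idea of a feature-rich discriminative model concrete, the following is a minimal sketch of scoring candidate target dependency trees with a linear model over sparse features. It is not the thesis's implementation: the feature templates, the function names (`extract_features`, `score`, `best_candidate`, `perceptron_update`), and the perceptron-style update are all assumptions chosen for illustration, since the abstract does not specify the learning algorithm.

```python
from collections import Counter


def extract_features(target_arcs, alignment):
    """Collect sparse features for one candidate target dependency tree.

    target_arcs: list of (head_word, dependent_word) pairs in the target tree.
    alignment:   list of (source_word, target_word) pairs.
    Both feature templates are illustrative assumptions, not the thesis's.
    """
    feats = Counter()
    for head, dep in target_arcs:
        feats[f"arc:{head}->{dep}"] += 1   # target syntax feature
    for src, tgt in alignment:
        feats[f"lex:{src}=>{tgt}"] += 1    # bilingual translation feature
    return feats


def score(weights, feats):
    """Linear model: dot product of the weight vector and feature counts."""
    return sum(weights.get(name, 0.0) * count for name, count in feats.items())


def best_candidate(weights, candidates):
    """Return the highest-scoring candidate (a stand-in for real decoding)."""
    return max(candidates, key=lambda c: score(weights, c["features"]))


def perceptron_update(weights, gold_feats, predicted_feats, lr=1.0):
    """Assumed perceptron-style update: reward gold features, penalize predicted ones."""
    for name, count in gold_feats.items():
        weights[name] = weights.get(name, 0.0) + lr * count
    for name, count in predicted_feats.items():
        weights[name] = weights.get(name, 0.0) - lr * count
```

Because both syntax and translation features live in the same weight vector, a single update adjusts them together, which is one way to read the abstract's statement that the two feature types are trained in one discriminative model.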
Keywords/Search Tags:tree-to-tree, statistic machine translation, discriminative learning, word ordering, dependency syntax