The Application Of Dependency Grammar In Chinese-to-English Statistical Machine Translation

Posted on: 2009-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: J F Jia
Full Text: PDF
GTID: 2178360272989751
Subject: Computer application technology
Abstract/Summary:
Syntax-based models have become an active area of statistical machine translation research in recent years. Compared with the classic phrase-based model, a syntax-based model can integrate more linguistic knowledge and can better guide both the translation search and reordering.

In this thesis, we focus on dependency grammar to explore the role of syntax in the translation process. We propose a generalized translation model whose templates are labeled with grammatical marks, and we implement a multi-language dependency parser and two machine translation systems based on dependency structure.

For syntactic parsing, we present a deterministic parser based on action sequences. We apply the state-of-the-art shift-reduce algorithm, and use online error correction based on statistical information, together with optimization over the whole action sequence, to reduce the mistakes caused by deterministic actions. We achieve a labeled attachment score (LAS) of 76.36% on Chinese and 82.93% on English on the CoNLL 2007 benchmark sets.

For machine translation, we present two statistical machine translation models based on dependency structure. Model 1 is fully lexicalized: we extract treelet structures on the source-language side and the corresponding contiguous word strings on the target-language side. By combining these with phrase-based templates, we achieve results on par with the classic phrase-based system. Model 2 generalizes the learned lexical templates. Unlike previous systems, we use grammar labels to constrain the generalized templates. The generalized form uses three kinds of variables, representing three different grammatical constraints. The grammar-labeled generalized templates guide the choice of translation more effectively, and in our experiments this model outperforms the classic phrase-based model.
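As a rough illustration of the shift-reduce parsing and the LAS metric mentioned above, the following minimal Python sketch applies a fixed action sequence to a sentence and scores the resulting labeled arcs. It assumes an arc-standard transition system, which the abstract does not specify. The function names `parse` and `las`, the action names, and the dependency labels are hypothetical choices made for this example. The statistical action classifier, the online error correction, and the action sequence optimization described in the thesis are not shown.

```python
from collections import deque


def parse(n_words, actions):
    """Apply an arc-standard action sequence (an assumed transition system).

    Tokens are numbered 1..n_words; 0 is an artificial root node.
    Returns a dict mapping each dependent to its (head, label) pair.
    """
    stack = [0]
    buffer = deque(range(1, n_words + 1))
    arcs = {}
    for action, label in actions:
        if action == "SHIFT":
            # Move the next input token onto the stack.
            stack.append(buffer.popleft())
        elif action == "LEFT":
            # The second-topmost item becomes a dependent of the topmost one.
            if len(stack) < 2 or stack[-2] == 0:
                raise ValueError("LEFT is not applicable here")
            dependent = stack.pop(-2)
            arcs[dependent] = (stack[-1], label)
        elif action == "RIGHT":
            # The topmost item becomes a dependent of the second-topmost one.
            if len(stack) < 2:
                raise ValueError("RIGHT is not applicable here")
            dependent = stack.pop()
            arcs[dependent] = (stack[-1], label)
        else:
            raise ValueError(f"unknown action: {action}")
    return arcs


def las(predicted, gold):
    """Labeled attachment score: share of tokens with correct head and label."""
    correct = sum(1 for dep, arc in gold.items() if predicted.get(dep) == arc)
    return correct / len(gold)


# Toy sentence "She reads books" (tokens 1, 2, 3); labels are illustrative.
actions = [
    ("SHIFT", None),
    ("SHIFT", None),
    ("LEFT", "nsubj"),   # She <- reads
    ("SHIFT", None),
    ("RIGHT", "obj"),    # reads -> books
    ("RIGHT", "root"),   # root -> reads
]
predicted = parse(3, actions)
gold = {1: (2, "nsubj"), 2: (0, "root"), 3: (2, "obj")}
score = las(predicted, gold)
```

In a deterministic parser of this kind, each action would be chosen by a trained classifier from the current stack and buffer rather than supplied by hand, so a single wrong action can propagate to later decisions. That is the kind of error the correction and optimization steps in the thesis are meant to reduce.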
Keywords/Search Tags: Dependency Grammar, Syntax Parsing, Machine Translation