A Study On Tree To String Based Mongolian And Chinese Statistical Machine Translation

Posted on:2017-09-11

Degree:Master

Type:Thesis

Country:China

Candidate:J Ning

GTID:2348330485485711

Subject:Computer Science and Technology

Abstract/Summary:

Phrase-based statistical machine translation is currently the mainstream approach to Mongolian-Chinese machine translation. Although the method is mature, it has inherent defects: poor generalization, weak long-distance reordering, an inability to represent translations of discontinuous phrases, and output sentences that do not reliably conform to target-language syntax. These deficiencies limit further development of the method, so introducing syntactic structure information into machine translation systems has become a new trend. Syntax-based statistical machine translation has been studied extensively, and some of the latest syntax-based systems perform better than phrase-based systems.

Implementing such a system also requires a high-accuracy Mongolian syntactic parser, which has considerable practical value in its own right. Statistical parsing has long been a research focus. In recent years researchers have obtained some results in Mongolian statistical parsing, but a gap remains compared with statistical parsing research for English, Chinese, and other languages. Research on Mongolian statistical parsing still focuses on parsing models based on probabilistic context-free grammar (PCFG). Parsing research on English, Chinese, and other languages has shown that adding lexical information can improve parsing accuracy.

This thesis covers three areas of work. First, it surveys related research in China and abroad, implements a PCFG-based parsing system using the open-source Stanford Parser, and implements an unlexicalized PCFG-based parsing system. Second, it designs and implements a tree-to-string Mongolian-Chinese statistical machine translation system. Finally, it conducts experiments on the Mongolian-Chinese statistical machine translation system and evaluates the results. The Mongolian parsing experiments show that the unlexicalized PCFG-based parser achieves precision and recall of 0.7701 and 0.7707, respectively, both higher than those of the vanilla PCFG-based parser. The translation experiments show that the BLEU and NIST scores of the tree-to-string system are almost the same as those of the phrase-based system. This suggests that the accuracy of the tree-to-string Mongolian-Chinese machine translation system can be improved further if Mongolian parsing accuracy is improved.
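The abstract reports parser precision and recall without saying how they are computed. A common convention for constituency parsers is labeled-bracket scoring, in which each constituent is identified by its label and the token span it covers. The sketch below illustrates that convention on toy English trees. It assumes labeled-bracket scoring, which the abstract does not confirm. The nested-tuple tree format and the function names `brackets` and `precision_recall` are hypothetical and not taken from the thesis.

```python
from collections import Counter


def brackets(tree, start=0):
    """Collect labeled spans from a nested-tuple tree.

    A tree is (label, child, ...); a child is either a token (str)
    or another tree. Returns (Counter of (label, start, end), end).
    """
    label, *children = tree
    spans = Counter()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a token advances the position by one
        else:
            child_spans, pos = brackets(child, pos)
            spans += child_spans
    spans[(label, start, pos)] += 1
    return spans, pos


def precision_recall(gold, predicted):
    """Labeled-bracket precision and recall of a predicted tree."""
    gold_spans, _ = brackets(gold)
    pred_spans, _ = brackets(predicted)
    # Multiset intersection counts brackets present in both trees.
    matched = sum((gold_spans & pred_spans).values())
    precision = matched / sum(pred_spans.values())
    recall = matched / sum(gold_spans.values())
    return precision, recall


# Toy trees (illustrative only, not data from the thesis).
gold = ("S", ("NP", "a", "b"), ("VP", "c", ("NP", "d")))
pred = ("S", ("NP", "a", "b"), ("VP", "c", "d"))
print(precision_recall(gold, pred))
```

Here the predicted tree misses the inner `NP` bracket, so all three of its brackets are correct but only three of the four gold brackets are recovered.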

Keywords/Search Tags:

Mongolian, Parsing, Unlexicalized, Tree to String, Machine Translation

Related items

1	Researched On Mongolian-Chinese Statistical Machine Translation Based On String To Tree Translation Model
2	Research On Mongolian Dependency Parsing Based On The Conversion Of Chinese-Mongolian Dependency Parsing Tree
3	Implementation And Analysis Of Tree To String Alignment Template Model In Statistical Machine Translation
4	Research Of Optimization Methods Integration And Translation Rerank For Mongolian-chinese Machine Translation
5	A Study On Statistical And Rule-Based Combined Mongolian-Chinese Machine Translation
6	Mongolian Lexical Analysis Research And Its Application In Statistical Machine Translation
7	Research On Some Key Technologies Of Tibetan Machine Translation Based On Tree To String
8	Research On Dependency-to-String Model For Chinese To English Example-Based Machine Translation
9	Research On Morphologically Asymmetric Chinese Mongolian Statistical Machine Translation Model Construction Methods
10	Multi-granularity Mongolian-chinese Neural Network Machine Translation Research