
A Research On Machine Translation By Incorporating Structured Information

Posted on: 2019-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: C Q Zhao
GTID: 2428330545485293
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, cross-language communication between people has become more and more frequent. Human translators alone cannot keep up with the growing demand for translation on the Internet, and machine translation is one of the most powerful approaches to this problem. With much higher efficiency and lower cost, machine translation has been widely adopted in industry and is an active research topic in natural language processing.

Statistical machine translation (SMT) based on the log-linear model gave new momentum to machine translation research. However, the phrase rules and hierarchical phrase rules in classical SMT are extracted from large-scale data by statistical methods, without any linguistic guidance, which yields a large number of translation rules of varying quality. At the same time, because explicit structured information is missing, the translation system often fails to choose the proper rules, so the correct translation cannot be obtained.

The same lack of structured information is found in recent neural machine translation (NMT) models, which map the source sequence directly to the target sequence with neural networks. Missing structure on the source side tends to impair the system's understanding of the source language, while missing structural guidance on the target side makes it difficult to model the relationships among target words. The result can be wrong translation, under-translation and over-translation, which limits translation performance.

This thesis studies the use of structured information in machine translation models. The main contributions are as follows:

1. A syntactic Treestate-based rule selection model is proposed to constrain how SMT selects translation rules from a huge rule set of varying quality. The thesis also defines a contextual feature extraction method for translation rules; the extracted features are used to train discriminative models that estimate the probability of each rule's Treestates. These probabilities are added to the log-linear model of SMT to determine whether a translation rule is appropriate in a given sentence. Results on a Chinese-English translation task show that the proposed method effectively improves the performance of SMT.

2. To address the lack of structured information in NMT, the thesis presents a phrase-based neural machine translation model. It incorporates phrase structure into NMT and models the correspondence between source phrases and target phrases. Results of Chinese-English translation experiments show that this method significantly improves translation performance.
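As a rough illustration of how an extra probability can enter a log-linear SMT model, the sketch below scores two candidate derivations as a weighted sum of feature values and adds one feature holding the summed log-probabilities of the rules used. It is only a minimal sketch, not the thesis's implementation: the feature names, weights and probabilities are hypothetical, and the Treestate probabilities are stood in for by plain numbers.

```python
import math


def log_linear_score(features, weights):
    """Weighted sum of feature values: sum_i weights[i] * features[i]."""
    return sum(weights[name] * value for name, value in features.items())


def add_rule_selection_feature(features, rule_probs):
    """Return a copy of `features` with one extra feature: the sum of the
    log-probabilities assigned to the rules used in the derivation."""
    extended = dict(features)
    extended["rule_selection"] = sum(math.log(p) for p in rule_probs)
    return extended


# Hypothetical feature weights (in practice these would be tuned).
weights = {"translation_model": 1.0, "language_model": 0.5, "rule_selection": 0.8}

# Hypothetical baseline feature values shared by both candidates.
base = {"translation_model": -2.0, "language_model": -4.0}

# Candidate A uses rules the discriminative model considers likely in context;
# candidate B uses rules it considers unlikely.
candidate_a = add_rule_selection_feature(base, [0.9, 0.8])
candidate_b = add_rule_selection_feature(base, [0.2, 0.1])

score_a = log_linear_score(candidate_a, weights)
score_b = log_linear_score(candidate_b, weights)
```

With identical baseline features, the candidate whose rules receive higher probabilities gets the higher overall score, which is the sense in which the added feature steers rule selection.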
Keywords/Search Tags:Statistical Machine Translation, Rule Selection, Neural Machine Translation, Structured Information, Syntax Tree