
Research On Model Learning For Machine Translation

Posted on: 2019-05-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H D Chen
Full Text: PDF
GTID: 1318330545475868
Subject: Computer Science and Technology
Abstract/Summary:
In the information era, cross-lingual activities have increased dramatically, which greatly enlarges the demand for language translation. Traditional human translation cannot satisfy this demand, so machine translation has become the practical way to handle it. Among the different lines of research on machine translation, data-driven approaches such as statistical machine translation (SMT) and neural machine translation (NMT) have gained more and more popularity because of their ability to learn translation knowledge automatically and their good performance. The log-linear model is flexible in incorporating new features and shows a significant advantage over the traditional source-channel models; it has therefore become the state-of-the-art modeling method and is applied in various SMT settings. However, current model learning research on log-linear SMT still faces several drawbacks. First of all, most discriminative training methods for SMT view parameter learning as a pairwise ranking task; decomposing the problem into pairwise comparisons neglects the global ordering of the hypothesis list, which may hinder learning. Secondly, modern SMT systems usually use a linear combination of features to model the quality of each translation hypothesis. The linear combination assumes that all features are in a linear relationship and constrains each feature to interact with the rest in a linear manner, which might limit the expressive power of the model and lead to an under-fitted model on the current data. For NMT, current models learn the representation of sentences based on characters or words, which makes it hard to utilize the structural information of sentences. In this thesis, the model learning task of machine translation is investigated in the following aspects.

1. This thesis proposes a listwise learning method for the SMT tuning problem. To address the inability of discriminative parameter learning methods in SMT to make use of the global ordering information of translation lists, our framework directly models the ordering of the entire translation list to learn parameters that better fit the given listwise samples. Furthermore, we propose top-rank enhanced loss functions, which are more sensitive to ranking errors at higher positions (a schematic loss is sketched after this list). Experiments show that both our listwise learning framework and the top-rank enhanced listwise losses lead to significant improvements in translation quality.

2. This thesis proposes a nonlinear framework for SMT. It models the quality of translation hypotheses in a nonlinear manner, which allows more complex interactions between features (see the scoring sketch below). A learning framework is presented for training the nonlinear models, and heuristics for designing the network structure that may improve nonlinear learning performance are also discussed. Experimental results show that translations produced by this framework are better.

3. This thesis proposes a multi-granularity information combination method for NMT, improving the NMT model by incorporating multiple levels of granularity. Specifically, it proposes (1) an encoder with character attention, which augments the (sub)word-level representation with character-level information (illustrated below); and (2) a decoder with multiple attentions, which enables the representations from different levels of granularity to control the translation cooperatively. Experiments show that this method significantly improves translation performance.

4. This thesis improves the modeling of translation quality with the inherent hierarchical structure information of natural language sentences, by explicitly incorporating source-side syntactic trees into the NMT model. More specifically, it proposes (1) a bidirectional tree encoder, which learns both sequential and tree-structured representations (a bottom-up sketch follows this list); and (2) a tree-coverage model that lets the attention depend on source-side syntax. Experiments verify the effectiveness of these methods.
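The top-rank enhanced listwise loss in contribution 1 can be pictured as a position-weighted listwise likelihood over an n-best list. The following is a minimal sketch, not the thesis's exact formulation: the weighting scheme, the use of sentence-level metric scores as the target order, and all function names are assumptions made for illustration.

```python
import numpy as np

def top_rank_listwise_loss(model_scores, metric_scores, top_weight=2.0):
    """Position-weighted ListMLE-style loss over one n-best list (sketch).

    model_scores  : model scores for each translation hypothesis
    metric_scores : quality scores (e.g. sentence BLEU) defining the target order
    Ranking errors near the top of the target order are weighted more heavily.
    """
    order = np.argsort(-np.asarray(metric_scores))   # target order, best first
    s = np.asarray(model_scores, dtype=float)[order]
    loss = 0.0
    for i in range(len(s)):
        # log-probability of picking the correct hypothesis at position i
        # from the remaining candidates (stable log-sum-exp)
        m = np.max(s[i:])
        log_p = s[i] - (m + np.log(np.sum(np.exp(s[i:] - m))))
        loss -= (top_weight / (i + 1.0)) * log_p     # weight decays down the list
    return loss
```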
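Contribution 2 replaces the linear feature combination of the log-linear model with a nonlinear scorer. The contrast can be sketched as below; the hidden-layer size, activation, and parameter names are illustrative assumptions, not the thesis's architecture.

```python
import numpy as np

def linear_score(features, w):
    # standard log-linear SMT scoring: a weighted sum of feature values,
    # so each feature contributes only independently and linearly
    return features @ w

def nonlinear_score(features, W1, b1, w2):
    # a small feed-forward scorer: the hidden layer lets features interact
    # before they are combined into a single hypothesis score
    hidden = np.tanh(features @ W1 + b1)
    return hidden @ w2

# toy usage on a single hypothesis with 8 features
f = np.random.randn(8)
print(linear_score(f, np.random.randn(8)))
print(nonlinear_score(f, np.random.randn(8, 16), np.zeros(16), np.random.randn(16)))
```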
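The character-attention encoder in contribution 3 augments each (sub)word embedding with an attention-weighted summary of its characters. The module below is an illustrative PyTorch sketch; the dimensions, the dot-product attention, and all names are assumptions rather than the thesis's exact design.

```python
import torch
import torch.nn as nn

class CharAugmentedEmbedding(nn.Module):
    """(Sub)word embeddings augmented with character-level attention (sketch)."""

    def __init__(self, n_words, n_chars, word_dim=256, char_dim=64):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.query = nn.Linear(word_dim, char_dim)   # word state queries its characters

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, length)    char_ids: (batch, length, max_chars)
        w = self.word_emb(word_ids)                      # (B, L, word_dim)
        c = self.char_emb(char_ids)                      # (B, L, C, char_dim)
        q = self.query(w).unsqueeze(2)                   # (B, L, 1, char_dim)
        att = torch.softmax((q * c).sum(-1), dim=-1)     # attention over characters
        char_summary = (att.unsqueeze(-1) * c).sum(2)    # (B, L, char_dim)
        return torch.cat([w, char_summary], dim=-1)      # word view + character view
```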
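Contribution 4's bidirectional tree encoder composes phrase representations over the source parse in addition to the sequential states. The sketch below shows only a bottom-up composition pass over a binary tree; the top-down (bidirectional) pass and the tree-coverage attention are omitted, and the composition function is an assumption for illustration.

```python
import torch
import torch.nn as nn

class BottomUpTreeEncoder(nn.Module):
    """Bottom-up composition over a binary parse tree (illustrative sketch)."""

    def __init__(self, dim=256):
        super().__init__()
        self.compose = nn.Linear(2 * dim, dim)   # merges two child phrase states

    def encode(self, node, leaf_states):
        # node is either a leaf index (int) or a (left_subtree, right_subtree) pair;
        # leaf_states holds the sequential encoder state of each source word
        if isinstance(node, int):
            return leaf_states[node]
        left = self.encode(node[0], leaf_states)
        right = self.encode(node[1], leaf_states)
        return torch.tanh(self.compose(torch.cat([left, right], dim=-1)))
```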
Keywords/Search Tags:Machine Translation, Data-driven, Model learning, Statistical machine translation, Neural machine translation