
Research On Neural Machine Translation With Priors Of Model Structures

Posted on: 2022-08-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z X Zheng
Full Text: PDF
GTID: 1488306725471644
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, owing to the great success of deep learning in language translation, neural machine translation (NMT) has become the most popular machine translation paradigm in both academia and industry. Mainstream NMT models regard language translation as a generic sequence-to-sequence learning problem, i.e., learning to convert source-language sentences into the corresponding target-language sentences, and therefore adopt the "encoder-decoder" framework for generic sequence-to-sequence learning as their modeling solution. In such a generic modeling scheme, however, the neural network design does not fully reflect the priors of human language understanding and translation (called priors of model structures in this thesis). This leaves a gap between current NMT models and the human language translation mechanism, and it hinders further improvement of translation performance. How to exploit the priors of model structures to guide the design of neural network models is therefore the key to narrowing this gap. This thesis focuses on two critical priors of model structures, hierarchy and symmetry, and aims to design NMT models that are more consistent with these priors, so that they model the translation task better, learn language translation knowledge from data more efficiently, and thus achieve better translation performance. The main contributions of this thesis include:

1. Proposing to explicitly model dynamic translation context in neural machine translation. This study introduces the hierarchy of "context-translation" into the decoding process of NMT systems. It explicitly models the translated content (the past) and the untranslated content (the future), and then predicts the translation based on this dynamic and holistic translation context, thereby capturing and understanding the translation context better and producing more accurate translations. Empirical evidence shows that this work effectively improves translation faithfulness and accuracy. In addition, visualizing the context needed for each prediction provides a way to interpret the behavior of NMT models. A hedged sketch of the past/future idea is given after this list.

2. Proposing to make the most of document context in neural machine translation. This study models the hierarchy of "local-global" in document-level translation within a general-purpose, unified framework that jointly captures the global document-level context and the local sentence-level context of both the source and the target language. This unified framework can translate texts with an arbitrary number of sentences, while making full use of both sentence-level and document-level corpora for training. Empirical evidence shows that the method outperforms sentence-level NMT models and improves over existing document-level models by substantial margins. It also shows that more context leads to more accurate document-level translation than using only the surrounding sentences, which to some extent refutes the negative observations about document context reported in previous research. A sketch of the local-global idea also follows the list.

3. Proposing a mirror-generative neural machine translation model. This study leverages the symmetry of translation between two languages to build a latent-variable-based, unified framework that jointly integrates a source-to-target and a target-to-source translation model, together with language models for the source and the target language, all of which share the generative process from a shared semantic space. Such modeling helps the translation models and language models learn from both parallel and non-parallel bilingual data more efficiently, while boosting each other during learning. Moreover, during decoding the translation models and language models can cooperate to generate better translations. Empirical evidence shows that the mirror-generative NMT model better exploits the potential of parallel and non-parallel data across language pairs, domains, and both high- and low-resource scenarios, thus achieving better performance. A hedged formalization of this symmetric generative process is given after the list.

4. Proposing a new paradigm of reversible neural machine translation. This study also leverages the symmetry of translation between two languages and proposes the reversible duplex Transformer, whose network architecture has two ends and is fully reversible: each end specializes in one language, and both ends can read and generate sentences. The model simultaneously utilizes bidirectional learning signals and captures a pair of translation directions with a single network, enabling reversible machine translation: the forward translation reads the source language at one end and generates the target language at the other, and likewise for the reverse translation. Empirical evidence shows that the proposed model outperforms unidirectional baselines on bidirectional translation, which demonstrates the superiority of reversible machine translation. Moreover, this study presents the first success of reversible machine translation, opening a completely new research direction for the machine translation community. A minimal sketch of a reversible coupling layer, the standard building block for such invertible architectures, closes the list.
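The abstract does not spell out the implementation of contribution 1, but the past/future decomposition can be illustrated with a minimal, hypothetical sketch: two recurrent trackers whose states move attended source content from the "future" into the "past" at every decoding step. All class and variable names below are illustrative, not the thesis's own.

```python
# Minimal sketch (assumed PyTorch design, not the thesis's exact code):
# track the translated "past" and untranslated "future" as two recurrent
# states updated by the attention context at each decoding step.
import torch
import torch.nn as nn

class PastFutureTracker(nn.Module):
    def __init__(self, ctx_size: int, state_size: int):
        super().__init__()
        self.past_cell = nn.GRUCell(ctx_size, state_size)    # accumulates what has been translated
        self.future_cell = nn.GRUCell(ctx_size, state_size)  # tracks what remains to be translated

    def init_states(self, src_summary: torch.Tensor):
        # the "past" starts empty; the "future" starts as a summary of the whole source
        return torch.zeros_like(src_summary), src_summary

    def step(self, attn_ctx: torch.Tensor, past: torch.Tensor, future: torch.Tensor):
        # at each step, the attended source content moves from FUTURE into PAST
        return self.past_cell(attn_ctx, past), self.future_cell(attn_ctx, future)
```

A decoder built on this sketch would condition each word prediction on its hidden state concatenated with both tracker states, so every prediction sees a dynamic, holistic view of the translation context; visualizing the two states is also what enables the interpretability analysis mentioned above.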
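For contribution 2, the "local-global" hierarchy can be pictured as two stacked attention passes: one restricted to each sentence, one over the whole document. The sketch below is an assumed, simplified reading of that idea; the layer and variable names are hypothetical.

```python
# Hedged sketch of local-global context modeling (illustrative names):
# tokens first attend within their own sentence, then over the whole
# document. With a single-sentence input the two passes coincide, so one
# model can train on both sentence-level and document-level corpora.
import torch
import torch.nn as nn

class LocalGlobalLayer(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.local_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.global_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, sent_ids: torch.Tensor):
        # boolean mask, True = blocked: forbid attention across sentence boundaries
        mask = sent_ids.unsqueeze(-1) != sent_ids.unsqueeze(-2)  # (batch, len, len)
        mask = mask.repeat_interleave(self.n_heads, dim=0)       # one copy per head
        local, _ = self.local_attn(x, x, x, attn_mask=mask)      # sentence-local pass
        out, _ = self.global_attn(local, local, local)           # document-global pass
        return out
```

In a full model, residual connections, layer normalization, and feed-forward sublayers would wrap each pass, as in a standard Transformer layer.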
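Contribution 3 describes two translation models and two language models sharing one generative process. A plausible formalization, reconstructed from the abstract's wording rather than quoted from the thesis, is a shared latent semantic variable z that symmetrically factorizes the joint likelihood:

```latex
% Hedged reconstruction of the "mirror" generative process; the exact
% factorization used in the thesis may differ in detail.
\[
\begin{aligned}
\log p(x, y \mid z)
  &= \tfrac{1}{2}\bigl[\log p(x \mid z) + \log p(y \mid x, z)\bigr] % source LM + src-to-tgt TM
   + \tfrac{1}{2}\bigl[\log p(y \mid z) + \log p(x \mid y, z)\bigr], % target LM + tgt-to-src TM
\qquad z \sim p(z).
\end{aligned}
\]
```

Under such a factorization, non-parallel data can train the marginal (language model) terms while parallel data trains all four components jointly, which is why the unified model can exploit both kinds of corpora; at decoding time the language model terms can likewise guide or rerank the translation model's hypotheses.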
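Contribution 4 rests on fully reversible network layers. A standard way to obtain exact invertibility, shown below as an assumed illustration rather than the thesis's actual architecture, is the additive coupling layer: each half of the state is updated by a function of the other half, so the computation can be undone exactly, letting one network run in both translation directions.

```python
# Minimal sketch of an additive coupling layer, the usual building block
# of fully reversible networks; f and g can be any sub-networks
# (e.g. attention or feed-forward blocks). Illustrative only, not the
# thesis's exact reversible duplex Transformer.
import torch
import torch.nn as nn

class ReversibleCoupling(nn.Module):
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1: torch.Tensor, x2: torch.Tensor):
        y1 = x1 + self.f(x2)   # update first half from the second
        y2 = x2 + self.g(y1)   # update second half from the new first
        return y1, y2

    def inverse(self, y1: torch.Tensor, y2: torch.Tensor):
        x2 = y2 - self.g(y1)   # exactly undo forward(), in reverse order
        x1 = y1 - self.f(x2)
        return x1, x2
```

Stacking such layers yields a network whose inverse is exact and needs no extra parameters, which is what allows one end to read a source sentence while the other emits the target, and vice versa.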
Keywords/Search Tags:Natural Language Processing, Neural Machine Translation, Priors of Model Structures, Hierarchy, Symmetry