
Research On Context Representation Methods For Machine Translation

Posted on: 2020-04-03    Degree: Doctor    Type: Dissertation
Country: China    Candidate: K H Chen    Full Text: PDF
GTID: 1368330590472860    Subject: Computer application technology
Abstract/Summary:
In recent years, large-scale neural networks have been used to represent natural language units as continuous-space vectors in place of traditional discrete symbols, significantly improving performance on a wide range of natural language processing tasks. Machine translation is among the most challenging of these tasks. Neural networks were first used to improve traditional statistical machine translation; sequence-to-sequence neural machine translation was then proposed to model translation directly, and translation quality has improved substantially as a result. These neural methods typically rely on the context information within a sentence to learn source representations and to generate target-language translations, so sentence-internal context plays a crucial role in machine translation. Although neural methods can capture the semantic similarity between the translation context and the predicted target word through continuous-space vectors, their high space-time complexity means they usually model translation context only at the word level. Intuitively, however, a natural language sentence carries not only basic word-level information but also higher-level context information, such as local context, structural context, and sentence-level topic context. Compared with basic words, this context information encodes rich translation knowledge but gives rise to a large number of high-order context units, so modeling it directly with neural networks causes more severe data sparsity and greater space-time complexity. To this end, this dissertation first explores how neural networks can explicitly represent high-order context units for translation prediction, validating the effectiveness of source-side dependency context in statistical machine translation. Building on that exploration, it then studies representations of context at different levels for neural machine translation, including local context, structural context, and sentence-level context, thereby improving translation performance. In short, the main content of this dissertation comprises the following four parts.

1. In machine translation, high-order context units are often used to encode richer translation knowledge for translation prediction. However, they face a serious data sparsity problem, which makes high-order context such as long-distance dependency units difficult to model; traditional discrete-symbol methods also struggle to capture the semantic similarity between the translation context and the predicted target word. To this end, this dissertation proposes a dependency-based neural network joint model. The model captures the semantic similarity between the translation context and the predicted target word, greatly alleviates the data sparsity caused by large-scale high-order context units through a convolutional architecture, and effectively exploits long-distance dependency information in the context for translation prediction. Experiments with a statistical machine translation system show that the proposed method is significantly superior to traditional discrete-symbol context representations; in particular, by explicitly encoding source-side long-distance dependency information, it outperforms the well-known word-based neural network joint model.
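To make the architecture of Part 1 concrete, the following PyTorch sketch shows one plausible shape for a dependency-based neural joint model: source words along a dependency path are embedded, compressed by a convolution into a fixed-size vector, and combined with the target-side history to score the next target word. The class name, layer sizes, and pooling choices are illustrative assumptions, not the dissertation's actual implementation.

```python
import torch
import torch.nn as nn

class DependencyJointModel(nn.Module):
    # Hypothetical sketch, not the dissertation's exact model.
    def __init__(self, src_vocab, tgt_vocab, dim=128, kernel=3):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        # The convolution compresses a variable-length dependency context
        # into a fixed-size vector, easing the sparsity of high-order units.
        self.conv = nn.Conv1d(dim, dim, kernel, padding=kernel // 2)
        self.out = nn.Linear(2 * dim, tgt_vocab)

    def forward(self, dep_context, tgt_history):
        # dep_context: (batch, ctx_len) ids of source words on the dependency path
        # tgt_history: (batch, hist_len) ids of previously generated target words
        src = self.src_emb(dep_context).transpose(1, 2)      # (B, dim, ctx_len)
        src = torch.relu(self.conv(src)).max(dim=2).values   # pool over the path
        tgt = self.tgt_emb(tgt_history).mean(dim=1)          # summarize history
        return self.out(torch.cat([src, tgt], dim=-1))       # next-word logits
```

Calling the model on a batch of dependency-path ids and history ids yields next-word logits; the max-pooling over the convolved path is what lets arbitrarily long dependency contexts map to a fixed-size representation.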
2. Natural language contains many polysemous words, that is, words that can express several distinct meanings. In existing neural machine translation models, no matter how many meanings a word has, it is represented by a single real-valued vector that must encode all of them. The encoder therefore cannot fully capture polysemous words in the sentence, which in turn degrades the context vector learned by the attention mechanism and makes the translations of polysemous words hard to distinguish when predicting target words; when the source or target sentence contains out-of-vocabulary words, this negative effect becomes even more serious. To address this issue, this dissertation proposes a local context-aware word representation method that dynamically learns a sentence-specific vector for each word to capture its meaning in the current sentence, thereby improving the representation and translation of words, especially polysemous and out-of-vocabulary words (a sketch appears after this summary).

3. Neural machine translation typically relies on neural networks that process the source input sequentially, encoding syntactic and semantic information only implicitly, and does not explicitly consider the syntactic context of the source sentence. Yet syntactic translation knowledge has been shown to benefit translation prediction in statistical machine translation. This dissertation therefore proposes two novel methods, a source dependency representation and a syntax-directed attention mechanism, to explicitly capture source-side long-distance dependency context for translation prediction. This allows the decoder to focus on syntactically more relevant source words and to learn a more effective translation context vector for predicting the target word (a sketch appears after this summary).

4. Neural machine translation usually focuses on word-level context when learning the context vector for translation prediction, ignoring sentence-level context. In natural language, a word often has different meanings under different topics (or domains); likewise, in neural machine translation, a word often has different meanings in different sentences, and a single sentence may even involve more than one topic for the same word. Sentence-level context thus also carries the topic information of a word. This dissertation therefore proposes a sentence-level topic context representation method that uses a convolutional neural network to represent the sentence-level context as a sequence of latent topic vectors. These latent topic vectors are introduced into an existing neural machine translation model through the attention mechanism to improve translation prediction, and the translation model jointly learns the sentence-level topic context and the target-word translation (a sketch appears after this summary).
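The first sketch below illustrates Part 2's local context-aware word representation. It assumes a gated combination of each word's static embedding with a sentence-specific vector from a bidirectional GRU, so the same word receives different vectors in different sentences; the gating scheme and all names are hypothetical, not the dissertation's exact design.

```python
import torch
import torch.nn as nn

class ContextAwareEmbedding(nn.Module):
    # Hypothetical sketch of a sentence-specific word representation.
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        # A bidirectional GRU reads the whole sentence so each position
        # sees its local context; output size matches the embedding size.
        self.rnn = nn.GRU(dim, dim // 2, bidirectional=True, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, sent):
        # sent: (batch, seq_len) word ids of one sentence
        static = self.emb(sent)            # context-independent vectors
        context, _ = self.rnn(static)      # sentence-specific vectors
        g = torch.sigmoid(self.gate(torch.cat([static, context], dim=-1)))
        # The gate interpolates between a word's global meaning and its
        # meaning in this particular sentence (helps polysemous words).
        return g * static + (1 - g) * context
```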
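The second sketch illustrates Part 3's syntax-directed attention. It assumes the standard dot-product attention score is biased by a Gaussian penalty on dependency-tree distance, steering the decoder toward syntactically closer source words; the distance-based penalty is one plausible realization rather than the dissertation's exact formulation.

```python
import torch

def syntax_directed_attention(query, keys, values, dep_dist, sigma=1.0):
    # Hypothetical sketch. query: (B, d) decoder state; keys, values:
    # (B, L, d) encoder states; dep_dist: (B, L) float dependency-tree
    # distances from each source word to the syntactically focal word.
    scores = torch.einsum('bd,bld->bl', query, keys)    # content relevance
    scores = scores - dep_dist ** 2 / (2 * sigma ** 2)  # syntactic proximity bias
    weights = torch.softmax(scores, dim=-1)             # attention distribution
    return torch.einsum('bl,bld->bd', weights, values)  # translation context vector
```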
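The third sketch illustrates Part 4's sentence-level topic context. It assumes a wide, strided convolution that summarizes the source sentence into a handful of latent topic vectors, over which the decoder state attends to obtain a topic context vector; the kernel width, stride, and attention form are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TopicContext(nn.Module):
    # Hypothetical sketch of a CNN-based sentence-level topic context.
    def __init__(self, dim=128, kernel=5, stride=3):
        super().__init__()
        # A wide, strided convolution: each output position summarizes a
        # span of source words, yielding a short sequence of latent
        # "topic" vectors (the sentence must have at least `kernel` words).
        self.conv = nn.Conv1d(dim, dim, kernel, stride=stride)

    def forward(self, src_states, dec_state):
        # src_states: (B, L, dim) encoder states; dec_state: (B, dim)
        topics = torch.tanh(self.conv(src_states.transpose(1, 2))).transpose(1, 2)
        scores = torch.einsum('bd,btd->bt', dec_state, topics)  # topic relevance
        weights = torch.softmax(scores, dim=-1)
        return torch.einsum('bt,btd->bd', weights, topics)      # topic context vector
```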
Keywords/Search Tags: Machine Translation, Context Representation, Attention Mechanism, Local Context, Structural Context, Topic Context, Neural Network