
Research On Document-Level Neural Machine Translation

Posted on: 2021-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhang
Full Text: PDF
GTID: 2428330605974882
Subject: Computer technology
Abstract/Summary:
At present, Neural Machine Translation (NMT) is the main research direction in machine translation. NMT research usually takes sentence-level translation as its object: during translation, each sentence is treated as an independent unit, so the contextual information of the sentence within its document is ignored. To exploit document-level information and generate more appropriate translations, so that translations remain consistent in style and accurate across the whole document or a specific semantic environment, this thesis proposes three methods.

(1) Context Recovery for Document-Level Neural Machine Translation. Sentence-level NMT suffers from incomplete semantic representation because the context of the current sentence is not considered. We extract document-level information from each sentence by dependency parsing and then complement the source sentences with this information, making their semantic representations more complete. We conduct experiments on the Chinese-English language pair and, to address the scarcity of document-level parallel corpora, propose a training method that exploits large-scale parallel corpora.

(2) Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation. To capture inter-sentential dependencies, document-level NMT usually integrates the information of context sentences into the current sentence. We propose a document-level framework that models cross-sentence dependencies by training the NMT model to predict both the target translation and the surrounding sentences of a source sentence. By forcing the model to predict the source context, we encourage it to learn "contextualized" source sentence representations that capture document-level dependencies on the source side.

(3) Fusing Context-Aware Sentence Representations for Document-Level Neural Machine Translation. When translating documents, an NMT system translates sentence by sentence without considering the representations of the other sentences in the document. We propose a document-level NMT model that uses an additional context encoder to learn context-aware representations of the source sentence with respect to the other sentences in the document, and then integrates these representations into both the encoder and the decoder. Compared with method (2), this method can exploit information from more source and target sentences.
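As a rough illustration of the multi-task idea in method (2), the training objective could take the form below; the auxiliary surrounding-sentence term and its weight λ are assumptions made for this sketch rather than details stated in the abstract:

\[
\mathcal{L}(\theta) = -\log P(y \mid x; \theta) \;-\; \lambda \sum_{c \in \mathcal{C}(x)} \log P(c \mid x; \theta)
\]

Here x is the current source sentence, y its reference translation, \(\mathcal{C}(x)\) the set of surrounding source sentences in the document, and λ balances translation against source-context prediction.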
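For method (3), the following is a minimal sketch of how representations from a separate context encoder might be fused into the current sentence's encoder states. The module names, the attention-plus-gate fusion, and all dimensions are illustrative assumptions; the thesis's exact way of integrating the representation into encoder and decoder is not specified in this abstract.

```python
# Sketch: fuse context-encoder states into current-sentence encoder states
# via attention followed by a learned gate (all details are assumptions).
import torch
import torch.nn as nn


class ContextFusion(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        # Attention from current-sentence states (queries) to context states (keys/values).
        self.ctx_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Gate deciding how much context to mix into each source position.
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, sent_states: torch.Tensor, ctx_states: torch.Tensor) -> torch.Tensor:
        # sent_states: (batch, src_len, d_model) -- encoder states of the current sentence
        # ctx_states:  (batch, ctx_len, d_model) -- states from the separate context encoder
        ctx_summary, _ = self.ctx_attn(sent_states, ctx_states, ctx_states)
        g = torch.sigmoid(self.gate(torch.cat([sent_states, ctx_summary], dim=-1)))
        # Convex combination of sentence states and attended context.
        return g * sent_states + (1.0 - g) * ctx_summary


if __name__ == "__main__":
    fusion = ContextFusion(d_model=512)
    sent = torch.randn(2, 20, 512)  # current sentence encoder output
    ctx = torch.randn(2, 60, 512)   # concatenated context-sentence encoder output
    print(fusion(sent, ctx).shape)  # torch.Size([2, 20, 512])
```

In the same spirit, the decoder could attend to these fused states, which is one plausible reading of integrating the context-aware representation on both the encoder and decoder sides.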
Keywords/Search Tags:Neural Machine Translation, Document, Context Recovery, Contextualized Sentence Representations