Research On Chinese-Mongolian Neural Machine Translation Based On AMR Semantic And Graph Neural Network

Posted on:2022-05-31

Degree:Master

Type:Thesis

Country:China

Candidate:Y Xue

Full Text:PDF

GTID:2505306542976619

Subject:Master of Engineering

Abstract/Summary:

With the deepening of globalization, exchanges between nations have become more frequent, but language differences create communication barriers. Machine translation helps break down these barriers and has gradually become a bridge between different countries and peoples. Mongolian is a minority language spoken by a relatively small number of people worldwide, and it is widely used in the Inner Mongolia Autonomous Region and other minority areas. As exchanges between Mongolian and Chinese speakers increase, Chinese-Mongolian machine translation has gradually developed. However, because it started late and lacks corpus resources, Chinese-Mongolian machine translation has produced fewer results than machine translation for other language pairs and has seen less cutting-edge research, so further study is needed.

Semantic representation clearly benefits machine translation, mainly because it helps preserve meaning and mitigate data sparseness. It has made great progress in statistical machine translation, but its use in neural machine translation has received less attention. Abstract Meaning Representation (AMR) is a new type of sentence-level semantic representation that uses a single-rooted directed acyclic graph, with content words abstracted as nodes and the relations between words as edges. It can express the semantics of a sentence more completely and accurately, and has therefore attracted growing attention. In recent years, as AMR semantic parsing systems have steadily improved, AMR has benefited many natural language processing tasks, such as sentiment analysis, relation extraction, and text summarization. AMR is still rarely used for machine translation, although related studies have shown that it can help improve translation quality. This thesis therefore studies the use of AMR semantic knowledge to assist Chinese-Mongolian neural machine translation.

To construct the corpus, the Chinese side of an existing Mongolian-Chinese aligned corpus was first segmented with a word segmentation tool, and AMR semantic graphs were then generated from the segmented text using the AMR editor developed by the University of Southern California. Because Mongolian is a low-resource language and Chinese-Mongolian parallel corpora are limited, unknown words often appear during translation and degrade the results. To alleviate this problem, the thesis applies byte pair encoding (BPE).

For the model, since there is no one-to-one correspondence between the AMR semantics and the words of the source sentence, the thesis builds the Chinese-Mongolian translation model on a dual encoder-decoder architecture. A bidirectional LSTM encoder encodes the source sentence. To fit the structure of the AMR graph, a graph recurrent neural network encodes the AMR semantic graph generated from the source sentence. The decoder is a recurrent neural network with a separate attention model over each of the two encoders. For comparison, a second dual encoder-decoder model is also built, in which the source-sentence encoder and the decoder are unchanged, while another bidirectional LSTM on the encoder side encodes a linearized AMR.

The experimental results show that adding AMR semantic knowledge helps improve the quality of Chinese-Mongolian machine translation, and that encoding the AMR graph with the graph recurrent neural network works better than encoding the linearized AMR. In addition, before the encoder-decoder models are trained, word2vec and GloVe are used to generate Mongolian and Chinese word vectors.

To further explore how AMR semantics contributes to meaning preservation and to handling data sparseness, the thesis adds an experiment that uses a dependency syntax tree to assist Chinese-Mongolian neural machine translation. A dual encoder-decoder model is used here as well; the difference is that a graph convolutional neural network encodes the dependency syntax tree. The results show that adding the dependency syntax tree also helps improve translation quality, but with the same amount of data, Chinese-Mongolian machine translation with the AMR semantic representation outperforms translation with the dependency syntax tree. This further suggests that the AMR semantic representation is well suited to preserving meaning and mitigating data sparseness.
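
The abstract contrasts two ways of presenting an AMR to an encoder: as a graph (for the graph recurrent neural network) and as a linearized token sequence (for the comparison bidirectional LSTM). The Python sketch below illustrates that distinction only. The example sentence ("The boy wants to go") is the standard English AMR illustration, not data from the thesis corpus, and the names `NODES`, `EDGES`, `linearize`, and `neighbours` are hypothetical. The depth-first linearization is an assumed scheme; the abstract does not say which one the thesis uses.

```python
from typing import Dict, List, Tuple

# Toy AMR for "The boy wants to go" (illustrative, not from the thesis corpus).
# Nodes map a variable to its concept; edges are (source, relation, target).
NODES: Dict[str, str] = {"w": "want-01", "b": "boy", "g": "go-01"}
EDGES: List[Tuple[str, str, str]] = [
    ("w", ":ARG0", "b"),
    ("w", ":ARG1", "g"),
    ("g", ":ARG0", "b"),  # re-entrancy: "boy" is an argument of both predicates
]
ROOT = "w"


def linearize(root: str, nodes: Dict[str, str],
              edges: List[Tuple[str, str, str]]) -> List[str]:
    """Flatten the graph into a token sequence by depth-first traversal.

    Assumed scheme: a node is expanded on first visit; on later visits
    only its concept is emitted, so the sequence stays finite.
    """
    children: Dict[str, List[Tuple[str, str]]] = {}
    for src, rel, tgt in edges:
        children.setdefault(src, []).append((rel, tgt))

    tokens: List[str] = []
    visited = set()

    def visit(var: str) -> None:
        if var in visited:
            tokens.append(nodes[var])  # re-entrant node: do not expand again
            return
        visited.add(var)
        tokens.append("(")
        tokens.append(nodes[var])
        for rel, tgt in children.get(var, []):
            tokens.append(rel)
            visit(tgt)
        tokens.append(")")

    visit(root)
    return tokens


def neighbours(var: str, edges: List[Tuple[str, str, str]]):
    """Return (incoming, outgoing) labelled neighbours of a node.

    This is the local context a graph encoder would aggregate per node.
    """
    incoming = [(src, rel) for src, rel, tgt in edges if tgt == var]
    outgoing = [(tgt, rel) for src, rel, tgt in edges if src == var]
    return incoming, outgoing


if __name__ == "__main__":
    print(" ".join(linearize(ROOT, NODES, EDGES)))
    print(neighbours("b", EDGES))
```

In the linearized form, the second mention of `boy` is just another token, so a sequence encoder has to infer that both mentions are the same entity. In the graph form, `neighbours("b", EDGES)` exposes both incoming `:ARG0` edges directly, which is the kind of structure a graph encoder can exploit.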

Keywords/Search Tags:

Abstract Meaning Representation, Dependency Syntax Tree, Graph Neural Network, Double Encoder-Decoder, Chinese-Mongolian Machine Translation

Related items

1	Research And Implementation Of Neural Machine Translation Model Based On Fusion Of Dependency Syntactic Information
2	Research On Attention-Based Neural Machine Translation With Encoder-Decoder Architecture
3	A Sentence-level Quality Estimation For Neural Machine Translation Based On Subword Regularization
4	Research On Network Architectures For Neural Machine Translation
5	Research On Neural Machine Translation Of Academic Thesis Based On Multi-Branch Tree Neural Network
6	Predicting The Pronunciation Feature Of Poetry Based On Deep Neural Network
7	Chinese-Thai Bilingual Neural Machine Translation Method Based On Tree-structured Attention Mechanis
8	Research On Mongolian-Chinese Machine Translation Based On End To End Neural Network
9	Research And Application Of Deep Neural Networks In Generating Classical Chinese Poetry From Images
10	Research On Chinese-Thai Neural Machine Translation Method Based On Unsupervised Syntactic Structure Learnin