
Two Direction Machine Translation Based On Sentence Semantic Embedding And Its Evaluation

Posted on: 2021-05-16  Degree: Master  Type: Thesis
Country: China  Candidate: Z L Jin  Full Text: PDF
GTID: 2428330611499994  Subject: Computer Science and Technology
Abstract/Summary:
Machine translation is the task of converting text written in one natural language into text written in another natural language by means of a computational model. In recent years, as the computational capacity of neural networks has grown, neural networks have become a practical basis for machine translation. Words can be mapped into a high-dimensional vector space, and a large-scale neural network can map the vector space of the source language to that of the target language. Neural translation systems have surpassed traditional statistical methods on many tasks. Although results have improved greatly, the overall system is still constrained by the scale of the training data: the demand for parallel corpora is large, and parallel corpora are difficult to annotate.

The neural machine translation model based on semantic vectors proposed in this thesis therefore makes use of existing parallel corpora without requiring additional ones. It builds semantic vectors for both the source language and the target language from data sets that are easy to obtain, and uses them to improve translation performance. In addition, because deep learning networks have a very large number of parameters, much of what happens during training is difficult to explain, including why and how the parameters change. In the analysis of the experimental results, we therefore examine the training process from a new perspective.

The first part of this thesis constructs high-quality sentence-level semantic embeddings. We compare the effect of different architectures on sentence encoding and use the best one to encode sentences. We also carry out cross-lingual experiments on a natural language understanding task. Taking this as the baseline semantic vector model, we further improve the encoder through multi-task learning. The semantic vectors finally obtained by cross-lingual unsupervised learning perform close to the source-language semantic vectors obtained by supervised learning.

The second part of this thesis integrates the constructed semantic vectors into the machine translation model. To our knowledge, this is the first time that semantic vectors trained on easily acquired data have been applied to the machine translation task. To make full use of the semantic vectors on both the source-language side and the target-language side, we train a two-way machine translation system, that is, we train translation from the source language to the target language and from the target language to the source language at the same time. Compared with strong Transformer baseline models, the proposed method yields a significant improvement on the WMT14 English-French data.

The final part of this thesis describes a way to allocate the change in loss to individual parameters, so that one can see whether each parameter's contribution to loss reduction is positive or negative. This LCA method is then applied to the model from the previous chapter. The experiments in Chapter 3 on standard data sets show that the new parameters of the bidirectional translation model with semantic vectors improve the BLEU score of the translations. The LCA analysis shows that these parameters also contribute positively to reducing the loss, which helps training. We also compare several different ways of computing LCA and reach the same conclusion.
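The idea of allocating a change in loss to individual parameters can be illustrated with a small sketch. The example below is not the thesis's implementation: the quadratic loss, the two parameters, the learning rate, and all function names are assumptions chosen for illustration. It allocates each training step's loss change to parameter i as gradient_i times the change in parameter i, and compares two ways of evaluating the gradient (at the start of the step, and at the midpoint of the step). Under the sign convention assumed here, a negative allocation means the parameter helped reduce the loss.

```python
# Toy loss-change allocation on a hypothetical quadratic loss.
# Everything here (loss, parameters, learning rate) is illustrative only.

def loss(theta):
    # Assumed loss: L(w, b) = (w - 3)^2 + 0.5 * (b + 1)^2
    w, b = theta
    return (w - 3.0) ** 2 + 0.5 * (b + 1.0) ** 2

def grad(theta):
    # Analytic gradient of the assumed loss above.
    w, b = theta
    return [2.0 * (w - 3.0), b + 1.0]

def sgd_path(theta0, lr, steps):
    # Record the parameter vector before and after every gradient step.
    path = [list(theta0)]
    for _ in range(steps):
        g = grad(path[-1])
        path.append([p - lr * gi for p, gi in zip(path[-1], g)])
    return path

def allocate(path, rule="first_order"):
    # Sum, per parameter, gradient_i * (change in parameter i) over all steps.
    totals = [0.0] * len(path[0])
    for cur, nxt in zip(path, path[1:]):
        if rule == "first_order":
            point = cur  # gradient at the start of the step
        elif rule == "midpoint":
            point = [(c + n) / 2.0 for c, n in zip(cur, nxt)]
        else:
            raise ValueError(f"unknown rule: {rule}")
        g = grad(point)
        for i, (gi, c, n) in enumerate(zip(g, cur, nxt)):
            totals[i] += gi * (n - c)
    return totals

path = sgd_path([0.0, 0.0], lr=0.1, steps=20)
first = allocate(path, "first_order")
mid = allocate(path, "midpoint")
actual_change = loss(path[-1]) - loss(path[0])

print("first-order allocation per parameter:", first)
print("midpoint allocation per parameter:   ", mid)
print("actual loss change:                  ", actual_change)
```

On this toy problem both rules give every parameter a negative allocation, so they agree on the sign of each contribution even though the first-order rule overstates the size of the decrease. The midpoint rule sums to the actual loss change here only because the assumed loss is quadratic; for a real network the allocation is an approximation whose accuracy depends on the integration rule.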
Keywords/Search Tags: Machine translation, deep learning, multi-task learning, sentence embedding, LCA