Research And Implementation Of Contract Translation Based On Neural Machine Translation Model

Posted on: 2022-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: M R Sun
GTID: 2518306329451164
Subject: Computer Science and Technology
Abstract/Summary:
As economic globalization accelerates, international trade is becoming more frequent, and Chinese-English translation is needed so that the parties can understand the details of a contract. More trade also means more contracts, and manual translation, whose cost is steadily rising, cannot meet the resulting demand. Machine translation is therefore gaining acceptance: for contract translation it is fast and greatly reduces translation cost. This thesis uses the Transformer model to study and implement the contract translation task. To improve the accuracy of contract translation, it makes the following contributions:

1. Construct a parallel corpus in the contract domain and expand it with back-translation (called reverse translation in this thesis). Because open Chinese-English parallel corpora are lacking in the contract domain, a crawler is used to collect contract data and build a parallel corpus. The resulting corpus is small, so the task is a low-resource machine translation task. To compensate for the insufficient parallel data, back-translation is used to generate a large pseudo-parallel corpus for training the neural machine translation model.

2. Propose a weighted summation method to improve the encoder output of the contract neural machine translation model. In this model, only the output of the last encoder layer is passed to the decoder, which may lose part of the semantic information of the source language and degrade model performance. To address this insufficient encoder output, a weighted summation method is proposed to improve the output vector of the encoder.

3. Propose a method to integrate the BERT pre-trained model into the contract neural machine translation model. To obtain more syntactic and semantic information about the source language, the sentence vector output by the BERT pre-trained model is fused into the contract neural machine translation model, so that the model can capture the syntactic and grammatical information of the sentence and perform better.

On the Chinese-English translation task in the contract domain, back-translation improves BLEU by 2.3 on average over the original Transformer model, the weighted summation method improves BLEU by 1.34 on average, and the model with BERT sentence vector fusion improves BLEU by 4.2 on average. Repeated experiments show that the proposed improvements raise the accuracy of Chinese-English machine translation in the contract domain and make machine translation a more credible way to translate contract text.
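The weighted summation of encoder outputs mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the weights are obtained or which vectors are summed, so this example makes two labeled assumptions: the sum is taken over the outputs of all encoder layers, and each layer has one scalar score that is normalized with a softmax. The names `softmax`, `weighted_layer_sum`, `layer_outputs`, and `layer_scores` are hypothetical and are not taken from the thesis.

```python
import math
from typing import List

Vector = List[float]
LayerOutput = List[Vector]  # one vector per source token


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_sum(layer_outputs: List[LayerOutput],
                       layer_scores: List[float]) -> LayerOutput:
    """Combine the outputs of all encoder layers into one representation.

    Assumption (not from the thesis): one scalar score per layer, turned
    into weights by a softmax. In a trained model these scores would be
    learned parameters; here they are plain numbers.
    """
    if len(layer_outputs) != len(layer_scores):
        raise ValueError("need exactly one score per encoder layer")

    weights = softmax(layer_scores)
    num_tokens = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])

    combined = [[0.0] * dim for _ in range(num_tokens)]
    for weight, layer in zip(weights, layer_outputs):
        for t in range(num_tokens):
            for d in range(dim):
                combined[t][d] += weight * layer[t][d]
    return combined


# Toy input: two encoder layers, one token, two dimensions.
layer_outputs = [
    [[1.0, 2.0]],  # output of layer 1
    [[3.0, 4.0]],  # output of layer 2
]
# Equal scores give equal weights, so the result is the plain average.
encoder_output = weighted_layer_sum(layer_outputs, [0.0, 0.0])
```

With equal scores the combined vector is the average of the layer outputs. Giving the last layer a much larger score than the others approaches the baseline described in the abstract, where only the last layer's output reaches the decoder.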
Keywords/Search Tags: Contract translation, Parallel corpus, Machine translation, Transformer model, Pre-training model