Research On Chinese-to-English Machine Translation Based On Neural Network

Posted on:2021-03-11

Degree:Master

Type:Thesis

Country:China

Candidate:J P Liu

GTID:2428330626960362

Subject:Computer Science and Technology

Abstract/Summary:

With its advantages of higher speed and lower cost, machine translation is considered a promising way to overcome communication barriers between languages. In recent years, with the development of deep learning, neural machine translation (NMT) based on the "encoder-decoder" structure has become the main research method in machine translation. However, owing to limited vocabulary size and imperfect coverage mechanisms, NMT suffers from problems such as out-of-vocabulary (OOV) words, over-translation, and under-translation.

To address the OOV problem, we propose a data generalization method based on the "substitution-translation-restore" framework. Firstly, we determine the types of OOV words to be processed in the corpus and design algorithms to recognize and align the bilingual OOV words. Secondly, the OOV words in both the training and test sets are replaced with specific generalization symbols, and the generalized corpus is then used for model training and translation prediction. Thirdly, the OOV words are translated by dictionary-based or rule-based methods. Lastly, the generalization symbols in the output of the NMT model are replaced with the translations of the OOV words to produce the final translation. Experimental results show that the data generalization method significantly improves both the performance of NMT systems and the translation accuracy of OOV words. Compared with the RNNSearch and Transformer baseline systems, the BLEU scores are increased by 4.72% and 4.21% respectively. A further experiment on the Transformer system shows that the translation accuracy of OOV words is increased by 35.16% on average.

To further alleviate the over-translation and under-translation problems, we propose a multi-coverage fusion mechanism based on the consistency and complementarity of the information stored in different coverage models. The translation information stored in both the coverage vector and the coverage score is used simultaneously to guide the attention mechanism. We first define a word-level coverage score and then propose two fusion methods. Experimental results show that our multi-coverage fusion model improves the performance of NMT and, compared with other coverage models, further improves alignment quality and alleviates over-translation and under-translation.
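
The "substitution-translation-restore" framework described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not name its OOV types or generalization symbols, so the single OOV type (numbers), the `$NUM_i` placeholder format, the helper names, and the `fake_nmt` stand-in for a trained model are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Assumption: numbers are the only OOV type handled here, and "$NUM_i" is a
# hypothetical generalization symbol (the abstract does not specify either).
NUMBER = re.compile(r"\d+(?:\.\d+)?")
SYMBOL = re.compile(r"\$NUM_(\d+)")


def substitute(sentence: str) -> Tuple[str, List[str]]:
    """Step 1: replace each number with an indexed generalization symbol."""
    found: List[str] = []

    def repl(match: "re.Match[str]") -> str:
        found.append(match.group(0))
        return "$NUM_{}".format(len(found) - 1)

    return NUMBER.sub(repl, sentence), found


def translate_oov(token: str, dictionary: Dict[str, str]) -> str:
    """Step 3: dictionary lookup first; otherwise the rule 'copy unchanged'."""
    return dictionary.get(token, token)


def restore(translation: str, found: List[str], dictionary: Dict[str, str]) -> str:
    """Step 4: put the translated OOV words back in place of the symbols."""

    def repl(match: "re.Match[str]") -> str:
        return translate_oov(found[int(match.group(1))], dictionary)

    return SYMBOL.sub(repl, translation)


def fake_nmt(generalized: str) -> str:
    """Step 2 stand-in: a lookup table playing the role of the NMT model."""
    table = {"我 买 了 $NUM_0 本 书": "I bought $NUM_0 books"}
    return table[generalized]


generalized, found = substitute("我 买 了 3 本 书")
print(generalized)                                  # 我 买 了 $NUM_0 本 书
print(restore(fake_nmt(generalized), found, {}))    # I bought 3 books
```

Because the model only ever sees the placeholder, the number itself never needs to be in the vocabulary; a real system would also need the bilingual recognition and alignment step that the abstract mentions, which this sketch omits.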

Keywords/Search Tags:

neural machine translation, data generalization, over-translation, under-translation, multi-coverage

Related items

1	Research Of Optimization Methods Integration And Translation Rerank For Mongolian-Chinese Machine Translation
2	Based On The Generalization Of The Instances Of Machine Translation
3	Methods For Handling OOV In Chinese-Uyghur Neural Machine Translation
4	Research And Implementation Of Uyghur-Chinese Machine Translation Based On Data Augmentation Technology
5	Research On Integrating Translation Memory Into Neural Machine Translation
6	Research On Improving Translation Diversity In Back-Translation
7	Research On Key Technology Of Post-optimization For Machine Translation
8	Multi-granularity Mongolian-Chinese Neural Network Machine Translation Research
9	Research On Unsupervised Neural Machine Translation Technique
10	Multi-source Information Enhanced End-to-end Neural Machine Translation