Research On Chinese-Korean Neural Machine Translation Method Based On Transfer Learning

Posted on: 2022-07-27

Degree: Master

Type: Thesis

Country: China

Candidate: Q Wang

GTID: 2518306338956139

Subject: Computer application technology

Abstract/Summary:

Translation is essential to the exchange of human thought. Intelligent translation technology accelerates the integration of different civilizations and promotes the development of human society. Deep learning has been successfully applied to modern machine translation and has achieved good results on many language pairs. However, neural machine translation models are limited by the amount of available training data, and translation quality is unsatisfactory for low-resource language pairs with little data. In this dissertation, we propose a transfer learning-based neural machine translation method for Chinese-Korean bilingual parallel corpora to improve translation performance. The main research work is summarized as follows.

First, we studied the automatic alignment of Chinese and Korean sentences and proposed a sentence alignment algorithm that makes use of the Sino-Korean words in Korean text. The corpus is split into sentences, which are then aligned according to probabilities and a dynamic programming algorithm.

Second, we proposed a Chinese-Korean neural machine translation method based on weight sharing. A parent model is trained under the encoder-decoder framework, and its network weights are passed to the child model. The vocabularies of the parent and child models are merged so that the word vectors of the child model are represented over a common vocabulary, and the child model is then trained until convergence.

Third, we proposed a method that incorporates a pre-trained language model. The BERT network structure is used as the encoder of the machine translation model, and the Transformer model is initialized from BERT. WordPiece encoding is used to segment the Chinese-Korean parallel corpus into subwords, which reduces the influence of out-of-vocabulary (unregistered) words.

The proposed method addresses the problems of out-of-vocabulary words and long-sentence processing, and it performs well in semantic fluency. The weight-sharing-based Chinese-Korean neural machine translation model achieves a BLEU score of 15.36, which is 2.68 higher than that of the baseline model, and the translation model combined with the pre-trained model achieves a BLEU score of 31.61, which is 1.74 higher than that of the baseline model. These results indicate that the proposed Chinese-Korean neural translation models can effectively translate Chinese text into Korean when the bilingual parallel corpus is insufficient.
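The weight-sharing step described in the abstract (pass the parent model's weights to the child model and represent the child's word vectors over a common vocabulary) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the dictionary layout of `parent_params`, the toy tokens, and the random initialization of tokens unseen by the parent are all assumptions made for illustration.

```python
import copy
import random


def build_shared_vocab(parent_tokens, child_tokens):
    """Merge two token lists into one index, keeping parent tokens first."""
    shared = {}
    for tok in list(parent_tokens) + list(child_tokens):
        if tok not in shared:
            shared[tok] = len(shared)
    return shared


def transfer_weights(parent_params, parent_tokens, shared_vocab, seed=0):
    """Initialise child parameters from a trained parent model.

    parent_params is a hypothetical dict: "embedding" holds one vector per
    parent token; every other entry stands in for encoder/decoder weights.
    """
    rng = random.Random(seed)
    parent_index = {tok: i for i, tok in enumerate(parent_tokens)}
    dim = len(parent_params["embedding"][0])

    # Non-embedding weights are copied unchanged from parent to child.
    child_params = copy.deepcopy(parent_params)

    # Rebuild the embedding table over the shared vocabulary.
    embedding = [None] * len(shared_vocab)
    for tok, idx in shared_vocab.items():
        if tok in parent_index:
            # Token known to the parent: reuse its trained vector.
            embedding[idx] = list(parent_params["embedding"][parent_index[tok]])
        else:
            # Token new to the child: small random init (an assumption).
            embedding[idx] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    child_params["embedding"] = embedding
    return child_params


# Toy data, invented purely for illustration.
parent_tokens = ["<unk>", "a", "b"]
child_tokens = ["<unk>", "b", "c"]
parent_params = {
    "embedding": [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4]],
    "encoder.layer0": [[1.0, 0.0], [0.0, 1.0]],
}

shared_vocab = build_shared_vocab(parent_tokens, child_tokens)
child_params = transfer_weights(parent_params, parent_tokens, shared_vocab)
```

In this sketch the child starts with the parent's vectors for every token the two vocabularies share and fresh vectors for the rest. Training the child model on the Chinese-Korean data until convergence would follow and is not shown here.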

Keywords/Search Tags:

Chinese-Korean neural machine translation, transfer learning, weight sharing, pre-trained language model, Chinese-Korean sentence alignment

Related items

1	Korean Characteristics, Newspapers And Language
2	The Effect Of Korean Popular Dramas For Korean And Chinese University Students
3	Research On Chinese-Mongolian Neural Machine Translation Based On Monolingual Corpora
4	Research And Application Of Uyghur-Chinese Machine Translation Model Based On Deep Learning
5	Applied Research Of Chinese-Korean Cross-Language Text Similarity Calculation
6	A Comparative Study Of News Headline Between Chinese And Korean
7	Research On Predicate Conversion For Chinese-Korea Machine Translation
8	Research On The Effect Of Chinese Viewers' Watching Motives Of South Korean TV Drama On Satisfaction And Loyalty
9	Research On Deep Learning Based Bilingual Long Sentence Segmentation Method
10	Research On Korean Spoken Language Identification