Research On Thai-Chinese Machine Translation Optimization Method Under Low Resource Conditions

Posted on: 2022-05-01, Degree: Master, Type: Thesis
Country: China, Candidate: Y H Zhang
GTID: 2518306335458384, Subject: Internet Technology
Abstract/Summary:
Machine translation is an important research area in natural language processing. With advances in computing and deep learning, neural machine translation has surpassed statistical machine translation. At present, however, it depends heavily on the quantity and quality of parallel corpora, and some language pairs, such as Thai-Chinese, do not have sufficient parallel data. Abundant external resources do exist, such as monolingual corpora, dictionaries, and translation models for high-resource language pairs, and making full use of them can alleviate the poor model performance caused by insufficient parallel data. To address the shortage of Thai-Chinese parallel corpora, this paper studies optimization methods for low-resource Thai-Chinese translation models. The main work is as follows:

1. A Thai-Chinese translation model based on parameter transfer is proposed. The parameters of the high-resource Thai-English and English-Chinese translation models are transferred to initialize the Thai-Chinese translation model, so that training starts from an informed parameter basis rather than from scratch, which improves the training of the translation model.

2. A Thai-Chinese translation model based on a bilingual phrase list is proposed. Words in the monolingual corpora are embedded and aligned to generate a bilingual phrase list, which is then added to the Thai-Chinese translation model for training. On the one hand, monolingual corpora contain abundant semantic information; on the other hand, different languages share similar features. By mapping the word vectors of different languages into the same space through cross-language alignment, a bilingual phrase list can be generated and added to the Thai-Chinese translation model to enrich the model's word-to-word mappings.

This paper studies the application of transfer learning and monolingual corpora to the Thai-Chinese translation model, and proposes training methods for the Thai-Chinese translation model based on parameter transfer and on a bilingual phrase list. The aim is to bring the semantic information contained in the translation models of high-resource language pairs and in the monolingual corpora of the low-resource language pair into the model. The experimental results show that both proposed methods improve the Thai-Chinese translation model.
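
As a rough illustration of the two ideas in the abstract, the following Python sketch shows one way they could look on toy data. It is not the thesis's implementation. Several things here are assumptions: that the encoder is taken from the Thai-English model and the decoder from the English-Chinese model, that parameters are stored in plain dictionaries with `encoder.` and `decoder.` name prefixes, that the word vectors have already been mapped into a shared space, and that cosine nearest-neighbour search with a fixed threshold is used to pick translation pairs. The function names `init_from_parents` and `build_phrase_list` and all example values are hypothetical.

```python
import math


def init_from_parents(child, th_en_parent, en_zh_parent):
    """Return a copy of `child` with encoder weights taken from the Thai-English
    parent and decoder weights taken from the English-Chinese parent.
    Assumption: this encoder/decoder split is illustrative, not from the thesis."""
    initialized = dict(child)
    for name, value in th_en_parent.items():
        if name.startswith("encoder.") and name in initialized:
            initialized[name] = value
    for name, value in en_zh_parent.items():
        if name.startswith("decoder.") and name in initialized:
            initialized[name] = value
    return initialized


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_phrase_list(src_vecs, tgt_vecs, threshold=0.8):
    """Pair each source word with its most similar target word, keeping only
    pairs whose similarity exceeds `threshold`.
    Assumption: both vector sets already live in one aligned space."""
    pairs = []
    for src_word, u in src_vecs.items():
        best_word, best_score = None, threshold
        for tgt_word, v in tgt_vecs.items():
            score = cosine(u, v)
            if score > best_score:
                best_word, best_score = tgt_word, score
        if best_word is not None:
            pairs.append((src_word, best_word))
    return pairs


# Toy parameters (hypothetical values).
child = {"encoder.w": [0.0, 0.0], "decoder.w": [0.0, 0.0], "bridge.w": [0.5]}
th_en = {"encoder.w": [0.1, 0.2], "decoder.w": [9.0, 9.0]}
en_zh = {"encoder.w": [7.0, 7.0], "decoder.w": [0.3, 0.4]}
initialized = init_from_parents(child, th_en, en_zh)

# Toy word vectors, assumed to be aligned already (hypothetical values).
thai_vecs = {"แมว": [1.0, 0.1], "น้ำ": [0.0, 1.0]}
zh_vecs = {"猫": [0.9, 0.2], "水": [0.1, 0.95], "书": [-1.0, 0.0]}
pairs = build_phrase_list(thai_vecs, zh_vecs)

print(initialized)
print(pairs)
```

Parameters that exist only in the child model (here `bridge.w`) keep their original initialization, and source words with no sufficiently similar target word are left out of the phrase list rather than being forced into a pair.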
Keywords/Search Tags: Translation model, Monolingual corpus, Transfer learning, Cross-language alignment, Bilingual phrase list