
Combining Discrete Lexicon Probabilities With Mongolian-Chinese Neural Machine Translation

Posted on: 2019-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: J T Li
GTID: 2428330563956748
Subject: Software engineering
Abstract/Summary:
In recent years, machine translation has advanced alongside the rapid development of artificial intelligence, and Neural Machine Translation (NMT) performs well on Mongolian-Chinese translation. However, because the corpus is small and Mongolian word formation is morphologically complex, a neural network cannot learn enough linguistic features. Based on the characteristics of Mongolian and the difficulties of Mongolian-Chinese machine translation, this thesis proposes integrating two sources of discrete word probability information into NMT in order to improve translation quality: the probabilities from Statistical Machine Translation (SMT) and the probabilities calculated from an external dictionary. Firstly, to address data sparsity, morphological analysis is carried out on the Mongolian corpus, particularly on case suffixes (the additional components of case). Three ways of handling case are considered, and a processing method is selected according to the properties of each model. Experiments show that choosing the Mongolian morphological analysis method to suit the model improves the quality of Mongolian-Chinese machine translation. Secondly, NMT has difficulty translating low-frequency words accurately. Building on previous work, this thesis proposes a method of combining discrete lexicon probabilities with Mongolian-Chinese NMT to alleviate the poor translation of low-frequency words. Finally, the thesis discusses how to acquire and use external resources such as dictionaries to improve translation quality. A Mongolian-Chinese dictionary was collated and corrected on the basis of existing resources, and its information was incorporated into the NMT system. In the NMT experiments, translation combining the discrete lexicon probabilities of SMT reached a BLEU score of 34.53, translation with case processing reached 35.72, and translation combining the mixed discrete lexicon probabilities (adding dictionary information) also improved. These results indicate that the proposed methods effectively improve Mongolian-Chinese machine translation.
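The abstract does not say how the lexicon probabilities are combined with the NMT output. One common approach in the literature is to weight each source word's lexicon entries by the decoder's attention and then linearly interpolate the result with the NMT softmax. The sketch below illustrates that idea only; the function names, the interpolation weight `lam`, and the toy tokens and probabilities are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict

def lexicon_distribution(attention, source_words, lexicon):
    """Attention-weighted lexicon probability over target words.

    lexicon maps a source word to {target word: p(target | source)}.
    """
    dist = defaultdict(float)
    for weight, src in zip(attention, source_words):
        for tgt, p in lexicon.get(src, {}).items():
            dist[tgt] += weight * p
    return dict(dist)

def interpolate(nmt_probs, lex_probs, lam):
    """Linear interpolation: (1 - lam) * p_nmt + lam * p_lex."""
    vocab = set(nmt_probs) | set(lex_probs)
    return {
        w: (1.0 - lam) * nmt_probs.get(w, 0.0) + lam * lex_probs.get(w, 0.0)
        for w in vocab
    }

# Toy data (hypothetical): a rare source word with a confident lexicon entry.
lexicon = {
    "src_rare": {"tgt_rare": 0.9, "tgt_common": 0.1},
    "src_other": {"tgt_common": 1.0},
}
source_words = ["src_rare", "src_other"]
attention = [0.8, 0.2]  # decoder attention at the current step (assumed)
nmt_probs = {"tgt_rare": 0.1, "tgt_common": 0.9}  # NMT under-predicts the rare word

lex_probs = lexicon_distribution(attention, source_words, lexicon)
combined = interpolate(nmt_probs, lex_probs, lam=0.5)
```

With these toy numbers, the probability of the rare target word rises from 0.10 under NMT alone to about 0.41 after interpolation, which is the kind of effect the low-frequency-word method aims for. A mixed lexicon (SMT plus dictionary) could be fed to the same functions in place of `lexicon`.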
Keywords/Search Tags:Mongolian-Chinese Machine Translation, Translation of Low Frequency Words, Discrete Lexicon Probabilities, Mongolian-Chinese Dictionary