
Mongolian-Chinese Neural Machine Translation Based On The Features Of Statistical Machine Translation

Posted on: 2018-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: J Du
Full Text: PDF
GTID: 2348330515452367
Subject: Computer Science and Technology
Abstract/Summary:
As machine translation has developed, statistical machine translation has hit a bottleneck and become difficult to improve further. As a result, researchers have gradually turned their attention to neural machine translation. Neural machine translation has recently achieved promising results on large-scale corpora, but there has been little research on small-scale corpora such as those available for Mongolian. Moreover, as a newly emerged method, neural machine translation has some limitations. (1) To reduce training complexity, it usually limits the vocabulary to a fixed size, which produces a serious out-of-vocabulary word problem that badly damages translation quality. (2) Its decoder lacks a mechanism to guarantee that every source word is translated, which leads to translations that are too short. (3) It cannot make good use of a language model.

For these reasons, we build a neural machine translation system on a Mongolian-Chinese parallel corpus and propose a method that combines features of statistical machine translation with neural machine translation to ease these problems. First, we built an attention-based neural machine translation system for Mongolian-Chinese. Second, we extracted several features from statistical machine translation, namely the translation model, word reward and language model, and defined the feature functions for them. Third, we built a Mongolian-Chinese alignment dictionary from the Mongolian-Chinese parallel corpus with GIZA++, and used IRSTLM to build the language model. Fourth, we used a log-linear model to add the alignment dictionary, word reward and language model to the decoder of the attention-based neural machine translation system, in order to address the limitations mentioned above. Finally, for the out-of-vocabulary problem in neural machine translation, we propose two methods, processing during translation and post-processing, which effectively reduce the number of out-of-vocabulary words.

The experimental results show that adding the features of statistical machine translation to Mongolian-Chinese deep neural machine translation can significantly improve translation performance, with a highest BLEU score of 30.66. The average output length on the test set rises from 16.7 to 19.1. The approach also handles 86% of the out-of-vocabulary words in neural machine translation.
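
The log-linear combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact feature functions or weights, so everything below is an assumption made for illustration: the function names, the weight values, the probability floor, and the choice to give the word reward to every token except the end-of-sentence marker are all hypothetical. Each candidate word at a decoding step receives a weighted sum of the NMT log-probability, a dictionary (translation model) log-probability, a language model log-probability, and a constant word reward.

```python
import math

EOS = "</s>"  # hypothetical end-of-sentence marker


def log_linear_score(word, nmt_logprob, tm_prob, lm_prob, weights, floor=1e-6):
    """Combine the features for one candidate word at one decoding step.

    The feature set and the probability floor are illustrative assumptions,
    not the thesis's actual definitions.
    """
    features = {
        "nmt": nmt_logprob,                   # log-probability from the NMT decoder
        "tm": math.log(max(tm_prob, floor)),  # alignment-dictionary probability
        "lm": math.log(max(lm_prob, floor)),  # n-gram language model probability
        "wr": 0.0 if word == EOS else 1.0,    # word reward: bonus for emitting a word
    }
    return sum(weights[name] * value for name, value in features.items())


def rescore_candidates(candidates, weights):
    """Rank (word, nmt_logprob, tm_prob, lm_prob) tuples by combined score."""
    scored = [(log_linear_score(w, n, t, l, weights), w) for w, n, t, l in candidates]
    return [word for _, word in sorted(scored, reverse=True)]


# Hypothetical weights; in practice they would be tuned on held-out data.
weights = {"nmt": 1.0, "tm": 0.5, "lm": 0.5, "wr": 0.2}

# The NMT model alone prefers "A", but the dictionary and language model favor "B".
candidates = [("A", math.log(0.5), 0.1, 0.1), ("B", math.log(0.4), 0.6, 0.5)]
ranking = rescore_candidates(candidates, weights)
```

In this sketch the word reward is the only feature that separates a regular word from the end-of-sentence marker when their other features are equal, which is one plausible way such a feature could lengthen short outputs.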
Keywords/Search Tags:Mongolian-Chinese machine translation, neural machine translation, features of statistical machine translation, out-of-vocabulary