
Research On Mongolian–Chinese Machine Translation Based On LSTM Neural Network

Posted on: 2019-07-29    Degree: Master    Type: Thesis
Country: China    Candidate: W W Liu    Full Text: PDF
GTID: 2428330563997706    Subject: Computer application technology
Abstract/Summary:
With the continuing development of information technology and cross-language communication, Machine Translation has gradually become a principal means of disseminating information across languages, and the quality of the translation method directly affects the quality of the translation. In Mongolian–Chinese Machine Translation, difficulties in word recognition, large differences in word order, and complex word formation mean that the semantic expression achieved by traditional Machine Translation methods is not ideal, which restricts translation quality. Neural Machine Translation models based on the LSTM have gradually emerged in Machine Translation owing to their distinctive encoder-decoder structure and capacity for semantic mining. However, there is still little research on Mongolian–Chinese Machine Translation combined with the LSTM. Therefore, this paper mainly studies the construction and optimization of an LSTM model with Mongolian–Chinese bilingual corpus preprocessing and Mongolian morpheme encoding.

In corpus preprocessing, to address the problem that words are poorly matched in traditional Mongolian–Chinese Machine Translation, this paper presents a GRU-CRF hybrid algorithm for constructing the segmentation module. The algorithm combines a GRU with a CRF to annotate the label sequence according to its semantics, so that word segmentation respects semantic relations and overcomes the inadequate use of context in HMM and CRF segmentation models. The segmented words are then vectorized using distributed representations.

At the model construction stage, to learn more grammatical and semantic knowledge from Mongolian language materials, this paper proposes an LSTM encoder based on morpheme coding and an LSTM decoder that decodes and predicts the Chinese output. Comparative experiments on sentences of different lengths show that the model improves translation quality by better handling long-range dependencies.

To further improve translation accuracy, this paper presents a multi-granularity local attention mechanism to optimize the model: the LDA algorithm is used to reduce the dimensionality of word-vector features, and Mongolian word and morpheme information is fused to improve bilingual word-alignment accuracy, thereby strengthening the LSTM translation model's ability to predict the translated text.

Finally, to verify the performance and feasibility of the LSTM Machine Translation model optimized with multi-granularity local attention, the optimized model is compared with a statistical Machine Translation model and an RNN baseline model. Taking the BLEU score as the translation evaluation standard, analysis of the experimental results shows that the translation quality of the optimized model is improved over both the baseline system and the statistical translation system.
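To illustrate the GRU-CRF segmentation module described above, the following is a minimal sketch in PyTorch, not the thesis implementation. It assumes a character-level B/M/E/S tagging scheme; class names, hyperparameters, and the omission of the CRF training loss (only Viterbi decoding is shown) are all assumptions made for brevity.

import torch
import torch.nn as nn

class GRUCRFTagger(nn.Module):
    """Sketch of a GRU-CRF segmentation tagger (hypothetical names, B/M/E/S tags)."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional GRU produces context-aware per-character features
        self.gru = nn.GRU(embed_dim, hidden_dim // 2, batch_first=True,
                          bidirectional=True)
        self.emit = nn.Linear(hidden_dim, num_tags)
        # CRF transition scores: transitions[i, j] = score of moving from tag i to tag j
        self.transitions = nn.Parameter(torch.zeros(num_tags, num_tags))

    def emissions(self, char_ids):
        # char_ids: (batch, seq_len) -> per-character tag scores (batch, seq_len, num_tags)
        h, _ = self.gru(self.embed(char_ids))
        return self.emit(h)

    def viterbi_decode(self, char_ids):
        # Decode the best tag sequence for a single sentence (batch size 1).
        scores = self.emissions(char_ids)[0]           # (seq_len, num_tags)
        seq_len, num_tags = scores.shape
        dp = scores[0]                                  # best score ending in each tag
        backptr = []
        for t in range(1, seq_len):
            # dp[i] + transitions[i, j] + scores[t, j] over all previous tags i
            cand = dp.unsqueeze(1) + self.transitions + scores[t]
            dp, best_prev = cand.max(dim=0)
            backptr.append(best_prev)
        best_tag = int(dp.argmax())
        path = [best_tag]
        for bp in reversed(backptr):
            best_tag = int(bp[best_tag])
            path.append(best_tag)
        return list(reversed(path))                     # tag indices, e.g. 0=B, 1=M, 2=E, 3=S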
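The morpheme-encoding LSTM encoder and LSTM decoder can likewise be sketched as a plain sequence-to-sequence model. This is an assumption-laden illustration rather than the thesis code: it assumes morpheme-segmented Mongolian input ids and Chinese output ids, single-layer LSTMs, teacher forcing for training, and greedy decoding at inference time.

import torch
import torch.nn as nn

class MorphemeSeq2Seq(nn.Module):
    """Sketch of an LSTM encoder-decoder over Mongolian morphemes (hypothetical names)."""
    def __init__(self, src_vocab, tgt_vocab, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, embed_dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Teacher-forced training pass: src_ids (batch, src_len), tgt_ids (batch, tgt_len)
        _, state = self.encoder(self.src_embed(src_ids))
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), state)
        return self.out(dec_out)                        # (batch, tgt_len, tgt_vocab)

    @torch.no_grad()
    def translate(self, src_ids, bos_id, eos_id, max_len=100):
        # Greedy decoding for a single sentence (batch size 1).
        _, state = self.encoder(self.src_embed(src_ids))
        token = torch.tensor([[bos_id]])
        output = []
        for _ in range(max_len):
            dec_out, state = self.decoder(self.tgt_embed(token), state)
            token = self.out(dec_out[:, -1]).argmax(dim=-1, keepdim=True)
            if int(token) == eos_id:
                break
            output.append(int(token))
        return output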
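For the multi-granularity local attention optimization, the sketch below shows only the "local window" aspect in the spirit of Luong-style local attention: at each decoder step, attention is restricted to encoder states near a predicted alignment position. The fusion of word- and morpheme-level vectors and the LDA dimensionality reduction are omitted, and all names and formulas here are assumptions, not the thesis mechanism.

import torch
import torch.nn as nn

class LocalAttention(nn.Module):
    """Sketch of local attention over a window of encoder states (hypothetical names)."""
    def __init__(self, hidden_dim, window=5):
        super().__init__()
        self.window = window
        self.align = nn.Linear(hidden_dim, 1)             # predicts the alignment position
        self.score = nn.Linear(hidden_dim, hidden_dim, bias=False)

    def forward(self, dec_state, enc_states):
        # dec_state: (hidden_dim,), enc_states: (src_len, hidden_dim)
        src_len = enc_states.size(0)
        # Predictive alignment position p_t in [0, src_len]
        p_t = src_len * torch.sigmoid(self.align(dec_state)).squeeze()
        lo = int(max(0, p_t.item() - self.window))
        hi = int(min(src_len, p_t.item() + self.window + 1))
        window_states = enc_states[lo:hi]                 # (w, hidden_dim)
        # Bilinear score between the decoder state and each windowed encoder state
        scores = window_states @ self.score(dec_state)    # (w,)
        # Gaussian term favouring source positions near p_t
        positions = torch.arange(lo, hi, dtype=torch.float32)
        gauss = torch.exp(-((positions - p_t) ** 2) / (2 * (self.window / 2) ** 2))
        weights = torch.softmax(scores, dim=0) * gauss
        weights = weights / weights.sum()
        # Context vector: weighted sum of the windowed encoder states
        context = (weights.unsqueeze(1) * window_states).sum(dim=0)
        return context, weights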
Keywords/Search Tags:Mongolian–Chinese Machine Translation, GRU-CRF Algorithm, LSTM Neural Network, Local Attention Model