
Research On Mongolian-Chinese Neural Machine Translation Based On ULR And Meta-learning Strategy

Posted on: 2022-03-02  Degree: Master  Type: Thesis
Country: China  Candidate: X Zhao  Full Text: PDF
GTID: 2518306542476574  Subject: Master of Engineering
Abstract/Summary:
In recent years, the growing need for communication between regions has driven sustained progress in machine translation, and minority languages and other low-resource languages have received increasing attention. Research on Mongolian-Chinese machine translation is needed to support the development of the Inner Mongolia Autonomous Region. At present, owing to limitations of the model architecture, the scarcity of Mongolian-Chinese parallel corpora, and the difficulty of extracting semantic features, translation still has several shortcomings: excessively long training time, inaccurate translation, inadequate expression of semantic information, and inaccurate word vector representations. This thesis analyzes these problems, and the specific research work is as follows:

(1) To address the scarcity of Mongolian-Chinese parallel corpora, Universal Lexical Representation (ULR) is used for word embedding on the parallel corpora. ULR maps the corpora into a unified embedding vector space, so that words with similar semantics have similar distributed representations. This provides semantic reference information for the subsequent Mongolian-Chinese translation and thereby improves translation quality.

(2) Alongside ULR processing, the low translation quality caused by out-of-vocabulary (OOV) words is studied. Word vectors are extracted with a hierarchical context encoder and a meta-learning algorithm to obtain accurate semantic representations and so address the OOV problem.

(3) To address the inability of autoregressive models to decode in parallel, an encoder-decoder model based on the non-autoregressive Transformer (NAT) framework and the MAML algorithm is adopted. Unlike the traditional Transformer, the non-autoregressive Transformer removes the dependence of each generated token on the output of the previous time step. Adding a fertility module to the encoder allows the decoder to produce its outputs in parallel, which increases decoding speed.

(4) In the model optimization stage, knowledge distillation is applied to the non-autoregressive translation model, which improves both the generation speed and the quality of Mongolian-Chinese translation. In the experiments, once the decoder produces its outputs in parallel, the current output no longer depends on its context; this causes a loss of contextual semantic information and degrades the final translation. A Neural Turing Machine (NTM) structure is therefore used to supplement the semantic information through its external memory, which alleviates the drop in translation quality caused by this loss.

Comparative experiments verify the effectiveness of the above methods in improving the quality of Mongolian-Chinese machine translation.
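The shared embedding space described in item (1) can be illustrated with a small sketch. It assumes a simplified form of ULR in which a word's query vector attends over a table of universal token embeddings and the result is the attention-weighted sum. The function names, the `temperature` parameter, and the toy vectors are illustrative assumptions, not details taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def universal_embedding(query, universal_keys, universal_values, temperature=1.0):
    """Soft lookup of a word in a shared (universal) embedding table.

    query            -- language-specific query vector for the word (assumed given)
    universal_keys   -- one key vector per universal token
    universal_values -- one embedding vector per universal token
    temperature      -- lower values make the lookup closer to a hard match
    """
    # Similarity of the query to every universal token, scaled by temperature.
    scores = [
        sum(q * k for q, k in zip(query, key)) / temperature
        for key in universal_keys
    ]
    weights = softmax(scores)
    dim = len(universal_values[0])
    # Weighted sum of universal embeddings: similar queries give similar outputs.
    return [
        sum(w * value[d] for w, value in zip(weights, universal_values))
        for d in range(dim)
    ]
```

Because every word in either language is expressed as a mixture of the same universal embeddings, words whose queries match similar universal tokens end up close together, which is the property the abstract relies on.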
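The role of fertility in item (3) can be sketched as follows, assuming the common NAT formulation in which each source token is copied as many times as its predicted fertility to build the decoder input. The function name and the example fertilities are hypothetical; in a real model the fertilities would be predicted from the encoder states.

```python
def copy_by_fertility(source_tokens, fertilities):
    """Build the decoder input by repeating each source token `fertility` times.

    The target length equals sum(fertilities), so it is known before decoding
    starts and every target position can be predicted in parallel.
    """
    if len(source_tokens) != len(fertilities):
        raise ValueError("need exactly one fertility per source token")
    decoder_inputs = []
    for token, fertility in zip(source_tokens, fertilities):
        # A fertility of 0 drops the token; a fertility of 2 duplicates it.
        decoder_inputs.extend([token] * fertility)
    return decoder_inputs
```

For example, `copy_by_fertility(["a", "b", "c"], [1, 0, 2])` yields `["a", "c", "c"]`, a fixed-length input that the decoder can process without waiting for earlier outputs.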
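The MAML algorithm mentioned in items (2) and (3) can be illustrated on a toy problem. The sketch below assumes scalar tasks with loss (theta - target)^2, chosen only so that the inner and outer gradients can be written by hand; it is not the thesis's translation model, and the learning rates and function name are illustrative.

```python
def maml_meta_step(theta, task_targets, inner_lr=0.1, outer_lr=0.1):
    """One MAML meta-update for toy tasks with loss (theta - target) ** 2."""
    meta_grad = 0.0
    for target in task_targets:
        # Inner loop: one gradient step adapts theta to this task.
        adapted = theta - inner_lr * 2.0 * (theta - target)
        # Outer gradient: differentiate the post-adaptation loss with respect
        # to the ORIGINAL theta. By the chain rule,
        # d(adapted)/d(theta) = 1 - 2 * inner_lr.
        meta_grad += 2.0 * (adapted - target) * (1.0 - 2.0 * inner_lr)
    meta_grad /= len(task_targets)
    # Meta-update: move the shared initialization so that a single inner
    # step works well on average across tasks.
    return theta - outer_lr * meta_grad
```

Repeating this step over tasks with targets 1.0 and 3.0 moves the shared initialization toward 2.0, the point from which one adaptation step serves both tasks equally well. In a low-resource setting the same idea is applied to model parameters, with high-resource tasks supplying the meta-training signal.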
Keywords/Search Tags: Mongolian-Chinese Machine Translation, Meta-Learning, ULR, NAT, NTM