| Communication is the lubricant of human social life. Good communication improves work efficiency and promotes the harmonious development of society, and it presupposes that the parties understand each other's language. China is a country with many nationalities and languages, and people who speak different languages and have never learned the other's language often cannot communicate directly. Studying translation between these languages is therefore necessary to promote mutual understanding among peoples. It also helps protect and promote the traditional cultures of ethnic minorities and helps realize the Chinese dream of great rejuvenation at an early date. The Yi are an ethnic minority distributed across southwest China, and the Yi script is one of China's ancient writing systems. To ease the communication difficulties that arise during ethnic integration in the Yi area, to promote the area's economic and cultural development, and to better promote the excellent traditional culture of the Yi people, this thesis builds on the rapid recent development of deep learning to study neural machine translation for the Yi language and implements neural machine translation from Yi to Chinese. The main work of this thesis is divided into the following three parts: (1) To complete the Yi-Chinese neural machine translation task, this thesis systematically surveys the relevant neural machine translation techniques. Because no Yi language corpus was available, this thesis collects and organizes Yi materials into a monolingual Yi corpus of 200,000 entries and a corpus of 70,000 Yi-Chinese translation pairs based on Yi words and ancient poems. The collected Yi vocabulary is then used to count word frequencies on the monolingual corpus and build a word frequency table. Yi words are sampled according to their frequency weights to construct labeled pseudo Yi sentences, a bidirectional LSTM model is trained to predict the sentence labels, and the Viterbi algorithm is finally applied to output the optimal segmentation, yielding Yi word segmentation based on deep learning. (2) Drawing on recent research on neural machine translation for low-resource languages, this thesis proposes a Yi-Chinese translation model with a dual encoder and dual decoder based on Transformer-XL. One encoder-decoder pair of the model comes from Transformer-XL; the other sub-encoder is a bidirectional LSTM, and the other sub-decoder is an LSTM combined with an attention mechanism. To better capture word order information, the encoder side introduces a complex-valued word embedding method. Because no full parallel Yi-Chinese corpus exists, but a partial translation corpus based on Yi words and ancient poems is available, this thesis uses the latter to initialize the translation model through weakly supervised learning. To verify the effectiveness of the proposed model, the introduced word embedding method, and the training method, and to compare them with statistical machine translation, this thesis conducts four sets of comparative experiments. (3) Implementation of the Yi-to-Chinese translation system. Based on the comparative translation experiments, the translation model that combines the complex-valued word embedding method with the weakly supervised learning method is selected to implement a translation system with a B/S (browser/server) architecture. The thesis describes the overall system architecture, functional modules, implementation process, and system deployment in detail, and presents a system stability test and a demonstration of translation results. |
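
The segmentation pipeline described in part (1) can be illustrated with a minimal sketch. It assumes a BMES character tagging scheme (Begin, Middle, End, Single), which the abstract does not specify, and it uses hand-written log-scores where the thesis would use the outputs of the bidirectional LSTM. All function names and the Latin placeholder vocabulary are hypothetical and stand in for real Yi words.

```python
import math
import random

LABELS = "BMES"  # Begin / Middle / End of a multi-character word, Single-character word
NEG_INF = -math.inf
# Assumed transition constraints: which labels may follow each label.
ALLOWED = {"B": "ME", "M": "ME", "E": "BS", "S": "BS"}


def word_labels(word):
    """Return the BMES labels for the characters of one word."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]


def pseudo_sentence(freq, n_words, rng):
    """Sample words in proportion to their frequency; return (chars, labels)."""
    words = rng.choices(list(freq), weights=list(freq.values()), k=n_words)
    chars = [c for w in words for c in w]
    labels = [lab for w in words for lab in word_labels(w)]
    return chars, labels


def viterbi(emissions):
    """Best valid BMES path for per-character log-scores (one dict per character)."""
    if not emissions:
        return []
    # A sentence can only start with B or S.
    score = {lab: (emissions[0][lab] if lab in "BS" else NEG_INF) for lab in LABELS}
    backpointers = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for cur in LABELS:
            # Best previous label among those allowed to precede `cur`.
            prev = max((p for p in LABELS if cur in ALLOWED[p]), key=lambda p: score[p])
            new_score[cur] = score[prev] + em[cur]
            ptr[cur] = prev
        score = new_score
        backpointers.append(ptr)
    # A sentence can only end with E or S.
    path = [max("ES", key=lambda lab: score[lab])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]


def segment(chars, labels):
    """Cut the character sequence after every E or S label."""
    words, current = [], ""
    for c, lab in zip(chars, labels):
        current += c
        if lab in "ES":
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


# Hypothetical scores for the characters "a", "b", "c". The second character
# locally prefers M, but B-M-S is not a valid path, so Viterbi picks B-E-S.
demo_emissions = [
    {"B": -0.1, "M": -3.0, "E": -3.0, "S": -2.0},
    {"B": -2.0, "M": -1.0, "E": -1.2, "S": -3.0},
    {"B": -3.0, "M": -3.0, "E": -2.0, "S": -0.2},
]
print(segment("abc", viterbi(demo_emissions)))  # prints ['ab', 'c']
```

Sampling with `pseudo_sentence` yields training pairs whose labels are correct by construction. That is what lets a tagger be trained without manually segmented text. The transition constraints in `viterbi` keep the decoder from emitting label sequences that do not correspond to any segmentation.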