
Research On Improving Machine Translation With Chinese Function Word Usages

Posted on: 2019-10-02  Degree: Master  Type: Thesis
Country: China  Candidate: H F Xu  Full Text: PDF
GTID: 2428330545453842  Subject: Computer technology
Abstract/Summary:
Globalization has made communication between people from different regions increasingly frequent, and translation helps the public overcome the language barrier. A skilled translator, however, needs a wide range of knowledge, including culture and tradition, to produce good translations. This raises the cost of human translation and creates demand for translation by computer. As a result, many machine translation methods have been proposed, including RBMT (Rule-Based Machine Translation), SMT (Statistical Machine Translation) and NMT (Neural Machine Translation). With the accumulation of data and the growth of available computing resources, the translation quality of end-to-end deep learning approaches is approaching that of human translation, but a significant gap in quality remains between Chinese-English translation tasks and translation tasks within the Indo-European language family. We suppose that one cause of this gap is that grammatical meaning, which is determined by morphology or syntax in the target languages, is normally expressed by function words or word order in Chinese. This thesis studies the automatic recognition of the usages of some common Chinese function words, based on the Chinese Function Word Knowledge Base (CFKB), and methods for incorporating these usages into neural machine translation. The main contributions of this thesis are as follows:
(1) Find an empirical CRF (Conditional Random Field) template for detecting Chinese function word usages, and propose a neural model that uses a bidirectional GRU (Gated Recurrent Unit) to automatically extract features from each side of the sentence while recognizing function word usages. Experiments show that the bidirectional GRU improves the F1 score on top of CRF in recognizing the usages of some function words.
(2) Integrate automatically recognized Chinese function word usages into neural machine translation in three ways: dividing the function word embedding into a word embedding and a usage embedding, replacing the function word with its specific usage, and constructing the representation from the word embedding and the usage embedding. Incorporating the usages of "De" by dividing the embedding improves the average BLEU (+0.67), and replacing "De" with its usages reduces SAER (Soft Alignment Error Rate) (-1.42).
(3) Seek a solution for the online deployment of the neural machine translation system and develop an online machine translation server.
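The abstract does not spell out how the usages are attached to the source tokens, so the following is only a minimal sketch of two of the strategies named in contribution (2), under stated assumptions. The embedding sizes, the random lookup tables, the usage tags `u1`/`u2`, the romanized placeholder tokens, and the helper names `replace_with_usage` and `split_embedding` are all hypothetical. Reading "dividing the embedding" as concatenating a word part and a usage part is likewise an assumption, not the thesis's confirmed design.

```python
import random

DIM = 8              # total embedding size (assumed)
USAGE_DIM = 2        # dimensions reserved for the usage part (assumed)
NO_USAGE = "<none>"  # placeholder usage for words that carry no usage tag


def make_table(keys, dim, seed=0):
    """Random lookup table standing in for trained embeddings."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for k in keys}


def replace_with_usage(tagged):
    """Replace each tagged function word by a usage-specific token."""
    return [f"{word}_{usage}" if usage else word for word, usage in tagged]


def split_embedding(tagged, word_table, usage_table):
    """Concatenate a word part (DIM - USAGE_DIM) and a usage part (USAGE_DIM)."""
    return [word_table[word] + usage_table[usage or NO_USAGE]
            for word, usage in tagged]


# Romanized placeholder sentence; "de" carries the hypothetical usage tag "u1".
tagged = [("wo", None), ("de", "u1"), ("shu", None)]
words = make_table(["wo", "de", "shu"], DIM - USAGE_DIM, seed=1)
usages = make_table([NO_USAGE, "u1", "u2"], USAGE_DIM, seed=2)

replaced = replace_with_usage(tagged)
vectors = split_embedding(tagged, words, usages)
```

In this sketch, replacement enlarges the source vocabulary with one token per usage, while the split variant keeps the vocabulary unchanged and lets all usages of a function word share the word part of the vector.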
Keywords/Search Tags:Deep learning, Chinese function word usages, Sequence labeling, Recurrent neural network, Neural Machine Translation