Research On Japanese Passive And Potential Voice Of Statistical Machine Translation

Posted on: 2018-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: N Wang
Full Text: PDF
GTID: 2348330512980243
Subject: Computer Science and Technology
Abstract/Summary:
Research on machine translation has made great progress in recent years, but translation quality is still not generally satisfactory. In statistical machine translation, how to use linguistic information such as tense and aspect remains an open problem. Japanese is an agglutinative language, and the suffixes of Japanese predicates form different voices in complex ways. Passive and potential predicates are formed with the same suffix, which derives from the same stem, and this ambiguity causes mistranslation in statistical machine translation. The hierarchical phrase model is modeled and decoded with a formal syntax and is easy to extend. However, because some context information is lost during translation, its results on passive and potential sentences are not satisfactory. This thesis proposes an approach to solve this problem and verifies it by experiments. The innovations and contributions of the thesis are as follows:

(1) Drawing on Japanese linguistics, passive and potential sentences are analyzed from the perspective of Japanese-to-Chinese and Japanese-to-English translation. By analyzing the structural characteristics of the Japanese dependency syntax tree, voice-related features are determined and used to build a voice classification model that effectively distinguishes the passive voice, the potential voice, and other voices.

(2) The ambiguity between potential and passive rules in the hierarchical phrase translation model is analyzed and summarized. Translating different voices can be regarded as selecting among rules for the respective voices, so more contextual information can be integrated during decoding. A rule feature extraction algorithm for the hierarchical phrase model is proposed.

(3) Because the same voice is usually expressed with different syntactic structures in different languages, which lowers translation quality, an approach is proposed that integrates voice features into hierarchical phrase-based models. Bilingual features are extracted to train a maximum entropy voice classification model for hierarchical phrase rules. The voice features are then integrated into the log-linear model to improve translation results and the accuracy of rule selection when translating passive and potential sentences.

In both Japanese-to-Chinese and Japanese-to-English translation tasks, large-scale experiments show that the proposed approach outperforms the baseline. The results indicate that the proposed method not only alleviates the problem of long-distance reordering but also improves translation quality on both passive and active voice test sets.
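The idea of adding a maximum entropy voice score to a log-linear model for rule selection can be illustrated with a minimal sketch. This is not the thesis's implementation: the feature names (`suffix=rareru`, `particle=ni`, `particle=ga`), the weights, the candidate rules, and the helper names `maxent_voice_probs`, `rule_score`, and `select_rule` are all hypothetical, chosen only to show how a voice probability might shift the choice between a passive and a potential rule.

```python
import math

VOICES = ("passive", "potential", "other")

def maxent_voice_probs(features, weights):
    """Maximum entropy (softmax) model: p(v | x) is proportional to
    exp(sum of the weights of the active features for voice v)."""
    scores = {v: sum(weights[v].get(f, 0.0) for f in features) for v in VOICES}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {v: math.exp(s - top) for v, s in scores.items()}
    z = sum(exps.values())
    return {v: e / z for v, e in exps.items()}

def rule_score(base_features, lambdas, voice_prob, lambda_voice):
    """Log-linear score: sum_k lambda_k * h_k, plus the weighted log
    probability of the rule's voice as one extra feature."""
    base = sum(lambdas[k] * h for k, h in base_features.items())
    return base + lambda_voice * math.log(voice_prob)

def select_rule(candidates, context, weights, lambdas, lambda_voice):
    """Pick the candidate rule with the highest log-linear score."""
    probs = maxent_voice_probs(context, weights)
    return max(
        candidates,
        key=lambda r: rule_score(r["features"], lambdas, probs[r["voice"]], lambda_voice),
    )

# Toy, hand-set weights (assumption): a real model would learn these from data.
weights = {
    "passive": {"suffix=rareru": 0.5, "particle=ni": 1.5},
    "potential": {"suffix=rareru": 0.5, "particle=ga": 1.5},
    "other": {},
}
context = ["suffix=rareru", "particle=ni"]

# Two hypothetical rules for the same source span; "tm" and "lm" stand in for
# translation-model and language-model log scores.
candidates = [
    {"target": "was eaten", "voice": "passive", "features": {"tm": -1.0, "lm": -2.0}},
    {"target": "can eat", "voice": "potential", "features": {"tm": -0.9, "lm": -2.0}},
]
lambdas = {"tm": 1.0, "lm": 0.5}

probs = maxent_voice_probs(context, weights)
with_voice = select_rule(candidates, context, weights, lambdas, lambda_voice=1.0)
without_voice = select_rule(candidates, context, weights, lambdas, lambda_voice=0.0)
print(with_voice["target"], "|", without_voice["target"])
```

In this toy setup the two rules differ only slightly in their base scores, so the voice feature decides the outcome once its weight is non-zero. In a real system the feature weights would be tuned together with the other log-linear weights rather than set by hand.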
Keywords/Search Tags:Passive Voice, Potential Voice, Statistical Machine Translation, Maximum Entropy Models