
Research On Japanese Tense Translation In Hierarchical Phrase-based Translation Model

Posted on: 2018-04-25 | Degree: Master | Type: Thesis
Country: China | Candidate: F Ming
GTID: 2348330512480243 | Subject: Computer Science and Technology
Abstract/Summary:
Machine translation is one of the most challenging fields in natural language processing, with wide application value and strong commercial prospects. The rapid development and spread of network technology has increased the volume of information and created a large demand for machine translation in many fields. Research on statistical machine translation (SMT) has achieved a great deal, but how to integrate linguistic knowledge into SMT effectively remains a hot topic. Machine translation is an important part of text processing, and the tense relations involved in text processing play an indispensable role in textual reasoning and filtering. At present, the study of tense is mainly confined to tense recognition, and there is comparatively little research on tense translation in SMT systems. Because tense is important linguistic information in a text, the tense problem studied in this thesis is treated as a problem of integrating linguistic knowledge into the SMT model.

The main object of study is Japanese tense, covering Japanese-Chinese and Japanese-English tense translation. Japanese is an agglutinative language: its tense is determined by the inflection of the predicate verb, and predicate suffixes vary widely. Different tense expressions share similar suffixes, which leads to low tense translation accuracy in SMT systems. To address these problems, this thesis presents an SMT method that integrates tense features. The main contributions are summarized below:

(1) A tense classification method based on Japanese dependency structure is proposed. The method analyzes the results of Japanese dependency parsing and combines them with the tense characteristics of the target language to extract the relevant tense information and build a maximum entropy model. The model identifies tense effectively, and its classification accuracy demonstrates the effectiveness of the classification method.

(2) A tense feature extraction algorithm for the hierarchical phrase-based model is proposed. By introducing annotated syntactic structure information, the algorithm extracts tense features for the rules that meet certain conditions at the same time as it extracts the translation rules.

(3) A method of integrating tense features into statistical machine translation is proposed. The method integrates the tense feature into the translation model through the log-linear model, adding linguistic tense constraints that guide the decoder's choices without increasing the complexity of the decoder. The method is language-independent: depending on the grammar of the languages involved, either monolingual or bilingual tense features can be integrated.

Translation experiments show that the method improves translation quality and alleviates the tense translation problem. Beyond improving the accuracy of tense translation, it also helps with word sense disambiguation and improves the reordering of sentence structures.
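As a rough illustration of contribution (3), the sketch below scores translation candidates with a log-linear model and adds one extra tense feature. It is not the thesis's implementation: the feature names (`log_tm`, `log_lm`, `tense`), the weights, the candidate sentences, and the simple match/mismatch definition of the tense feature are all assumptions made for this example.

```python
def loglinear_score(features, weights):
    """Log-linear model score: the weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())


def tense_feature(predicted_tense, candidate_tense):
    """Hypothetical tense feature: 0.0 on a match, -1.0 penalty otherwise."""
    return 0.0 if predicted_tense == candidate_tense else -1.0


def best_candidate(candidates, weights, predicted_tense):
    """Return the candidate with the highest log-linear score."""
    def score(cand):
        features = {
            "log_tm": cand["log_tm"],  # translation model log-score (made up)
            "log_lm": cand["log_lm"],  # language model log-score (made up)
            "tense": tense_feature(predicted_tense, cand["tense"]),
        }
        return loglinear_score(features, weights)
    return max(candidates, key=score)


# Toy candidates for a source sentence whose tense a classifier labeled "past".
candidates = [
    {"target": "he goes to school", "tense": "present",
     "log_tm": -1.2, "log_lm": -2.0},
    {"target": "he went to school", "tense": "past",
     "log_tm": -1.3, "log_lm": -2.1},
]

baseline_weights = {"log_tm": 1.0, "log_lm": 1.0, "tense": 0.0}
tense_weights = {"log_tm": 1.0, "log_lm": 1.0, "tense": 0.5}

print(best_candidate(candidates, baseline_weights, "past")["target"])
print(best_candidate(candidates, tense_weights, "past")["target"])
```

Because the tense feature is just one more weighted term in the sum, the decoder's search procedure is unchanged; only the ranking of candidates shifts, which matches the abstract's claim that the constraint is added without increasing decoder complexity.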
Keywords/Search Tags:Statistical Machine Translation, Tense, Hierarchical Phrase-based Model, Maximum Entropy Model