Research Of Statistical Translation Models Based On Lexical And Phrasal Semantics

Posted on: 2018-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: H Q Tang
Full Text: PDF
GTID: 2348330542465281
Subject: Computer Science and Technology
Abstract/Summary:
Phrase-based statistical machine translation (SMT) handles semantic dependencies within phrases well, but it captures long-distance dependencies poorly because it uses phrases as its translation units. In addition, any sequence of consecutive words can constitute a phrase, so phrases do not necessarily correspond to semantic structures, and a phrase-based translation model can therefore exploit little semantic information. To address these shortcomings, this thesis studies SMT based on lexical and phrasal semantics, introducing semantic information into phrase-based SMT in order to improve translation performance. Our work consists of two parts.

(1) Because phrase-based SMT cannot capture long-distance semantic constraints, verbs and their subjects or objects are often translated incorrectly when they are far apart. To address this issue, we study the incorporation of lexical semantics into SMT. We propose a translation model based on verbal selectional preferences, which integrates the preferences that verbs impose on their subjects and objects into translation. First, we extract all verb-object and verb-subject selectional preference instances from the training corpus. Then, we apply a conditional probability estimation method and a topic model to compute monolingual and bilingual selectional preferences for verbs under the verb-object and verb-subject semantic relations, respectively. Finally, we integrate the verbal selectional preferences into a state-of-the-art phrase-based SMT system. Our experiments show that the proposed translation model can address errors in the translation of verbs and their arguments.

(2) Because phrase-based SMT uses rather limited semantic knowledge, it makes lexical translation errors when selecting translations for source words with multiple senses. We therefore explore phrasal semantics for SMT. We propose a supersense-based translation model, which incorporates coarse-grained word senses into translation for the first time. First, we adopt a supersense tagging method to annotate source words with supersenses. Then, we train the supersense-based translation model in two ways: as a maximum entropy model and as a sense embedding model. Finally, we design corresponding algorithms to integrate the two models into a state-of-the-art phrase-based SMT system. Experimental results indicate that our supersense-based translation models significantly improve translation quality by alleviating errors in translating source word senses.
Keywords/Search Tags: Semantic Constraints, Selectional Preferences, Semantic Knowledge, Supersenses, Sense Embeddings
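
The abstract mentions a conditional probability estimation method for verbal selectional preferences but does not give its form. As a minimal sketch only, assuming a plain maximum-likelihood estimate P(argument | verb) = count(verb, argument) / count(verb) over extracted verb-object pairs, the idea might look like the following. The function name `estimate_preferences` and the toy pairs are hypothetical illustrations, not the thesis's implementation or data.

```python
from collections import Counter, defaultdict


def estimate_preferences(pairs):
    """Relative-frequency estimate of P(argument | verb).

    `pairs` is a list of (verb, argument) tuples, e.g. verb-object
    instances. This estimator is an assumption made for illustration;
    the abstract does not specify the exact method used in the thesis.
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    prefs = defaultdict(dict)
    for (verb, arg), n in pair_counts.items():
        # Share of this verb's occurrences that take this argument.
        prefs[verb][arg] = n / verb_counts[verb]
    return dict(prefs)


# Hypothetical toy verb-object instances (not from the thesis corpus).
toy_pairs = [
    ("eat", "apple"),
    ("eat", "apple"),
    ("eat", "rice"),
    ("drink", "water"),
]
prefs = estimate_preferences(toy_pairs)
```

In this sketch, `prefs["eat"]` maps each observed object of "eat" to its relative frequency, so the values for any one verb sum to 1. A bilingual variant would condition on a source verb and count target-side arguments instead, and a topic model would replace the raw counts with a smoothed distribution.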