The Analysis and Knowledge Base Expansion of the HowNet Machine Translation System

Posted on: 2018-07-29 | Degree: Master | Type: Thesis
Country: China | Candidate: L Wang
GTID: 2348330512973287 | Subject: Computer software and theory
Abstract/Summary:
Machine Translation (MT) is the process of automatically transforming one natural language into another by computer. Current state-of-the-art MT methods include Rule-based Machine Translation (RBMT), Statistical Machine Translation (SMT), and Neural Machine Translation (NMT), among others. In each of these methods, applying semantic knowledge during translation leads to an improvement, so semantics has attracted the attention of many MT researchers in recent years. The HowNet MT system studied in this paper is a typical application of semantic knowledge in MT. It is a knowledge-based translation system whose knowledge base comprises the HowNet knowledge base, an axiomatic rule base, and a translation rule base. The HowNet knowledge base serves as the language resource for machine translation, the axiomatic rules lay the foundation for translation disambiguation, and the translation rules control the whole translation process, from logical semantic analysis to translation transformation and generation.

This paper studies and summarizes the theoretical basis and the translation process of the system, and tests its performance on patent titles, an aviation corpus, and a China Daily corpus. The advantages and problems of the HowNet MT system are analyzed. Among the problems, unknown words and candidate word ranking account for a large proportion, and both are even more severe in the aviation corpus.

To address unknown words, this paper expands the HowNet knowledge base in the HowNet MT system and presents an automatic method, based on the headword, for constructing domain terminology knowledge. The method builds a term knowledge base by acquiring the DEF of each term's headword. Since headwords usually have multiple senses, we expanded the axiomatic rules for HowNet sense colony disambiguation, and used term context feature expansion and translation candidate ranking to perform disambiguation. After the terms were added to the HowNet knowledge base, translation performance improved significantly. To address candidate word ranking, the paper expands the special translation rule base; with the expanded special rules, candidate word ranking performance improves. We also designed and implemented a dynamic tracking and debugging system that helps users modify and add translation rules easily. With the expanded translation rule base, translation results improve noticeably.
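
As a rough illustration of the headword-based idea described in the abstract, the sketch below shows how an unknown term might inherit a DEF from its headword, with the term's context used to choose among the headword's senses. This is only a minimal sketch under stated assumptions, not the thesis's implementation: the lexicon TOY_LEXICON, the DEF strings in it, the last-token headword heuristic, and the functions headword, choose_def, and build_term_entry are all hypothetical, and the overlap count stands in for the axiomatic rules and candidate ranking, which the abstract does not detail.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicon: headword -> candidate DEF strings.
# These DEFs are invented for illustration and are not real HowNet entries.
TOY_LEXICON: Dict[str, List[str]] = {
    "wing": ["{part|whole=aircraft}", "{part|whole=bird}"],
    "engine": ["{machine|use=drive}", "{vehicle|domain=railway}"],
}


def headword(term: str) -> str:
    """Assumption: the headword of an English noun-phrase term is its last token."""
    return term.lower().split()[-1]


def choose_def(candidates: List[str], context: List[str]) -> str:
    """Pick the DEF that mentions the most context words.

    A simple overlap count stands in for rule-based disambiguation.
    Ties keep the first candidate, because max() returns the first maximum.
    """
    def score(definition: str) -> int:
        return sum(1 for word in context if word.lower() in definition)

    return max(candidates, key=score)


def build_term_entry(term: str, context: List[str]) -> Optional[dict]:
    """Build a term entry by inheriting a DEF from the term's headword.

    Returns None when the headword itself is missing from the lexicon.
    """
    head = headword(term)
    candidates = TOY_LEXICON.get(head)
    if not candidates:
        return None
    return {"term": term, "headword": head, "def": choose_def(candidates, context)}


if __name__ == "__main__":
    print(build_term_entry("swept wing", ["aircraft", "flight"]))
    print(build_term_entry("landing gear", ["aircraft"]))
```

In this sketch, a term whose headword is absent from the lexicon is left unresolved (None) rather than given a guessed DEF, so a missing entry stays visible for later handling.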
Keywords/Search Tags:Semantic knowledge, HowNet MT system, Logical semantics, Term knowledge base, Translation rule