Phrase Semantic Model Rules For Machine Translation Research

Posted on: 2003-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: X L Zheng
Full Text: PDF
GTID: 2208360092471226
Subject: Computer applications
Abstract/Summary:
Phrases play an important role in Chinese. Investigating how Chinese phrases are constructed can guide phrase parsing and raise the accuracy of syntactic and semantic parsing; it can also support research on Chinese words and sentences and improve the quality of Chinese text parsing. This thesis systematically investigates the representation, acquisition, and application of Chinese phrase construction. HowNet is used as the main semantic resource. Based on a thorough study of HowNet's Knowledge Dictionary Mark-up Language (KDML) and of Chinese phrase construction, a representation of semantic pattern rules was designed to formalize the construction. On this foundation, a corpus-based algorithm was designed and implemented to acquire and filter binary semantic pattern rules automatically. The algorithm adopts a metarule-guided data mining method for cross-level association rules to find the semantic regularities of word combinations in a Chinese phrase corpus. Statistical results are then used to filter the findings, and the remaining candidates are transformed into binary semantic pattern rules. To strengthen their capacity for word sense disambiguation and structural disambiguation, these rules were optimized and expanded according to manually summarized rules of word combination, yielding a set of semantic pattern rules. Taking into account the features of the Chinese-English machine translation system XMMT, a semantics-based disambiguation algorithm was designed and implemented.
With this algorithm, word sense disambiguation and structural disambiguation can be performed by matching semantic pattern rules during syntactic parsing. The experimental results indicate that: (a) the representation of semantic pattern rules formalizes the construction of Chinese phrases quite well; (b) the corpus-based algorithm for acquiring and filtering binary semantic pattern rules is effective, reducing human labor and avoiding the subjectivity and one-sidedness of rules written by hand; (c) the semantics-based disambiguation algorithm achieves satisfactory results.
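The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the thesis's actual algorithm. It assumes a toy sememe hierarchy (hypothetical, not taken from HowNet), treats the metarule as the fixed template "left class + right class", filters candidate rules by support and confidence thresholds chosen for illustration, and then uses the surviving rules to pick a word sense, preferring the most specific matching rule. All names (`HYPERNYM`, `mine_binary_rules`, `disambiguate`) are illustrative.

```python
from collections import Counter
from itertools import product

# Hypothetical toy hierarchy (child -> parent); a stand-in for a real sememe taxonomy.
HYPERNYM = {
    "fruit": "food", "food": "physical", "tool": "physical",
    "eat": "consume", "consume": "act", "use": "act",
}

def ancestors(sememe):
    """Return the sememe followed by its hypernyms, most specific first."""
    chain = [sememe]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def mine_binary_rules(phrases, min_support=2, min_confidence=0.65):
    """Mine (left, right) class pairs from two-word phrases at every level.

    phrases: list of (left_sememe, right_sememe) pairs.
    Returns {(left, right): (support, confidence)}, where confidence is
    count(left, right) / count(left). Thresholds are illustrative assumptions.
    """
    pair_counts, left_counts = Counter(), Counter()
    for left, right in phrases:
        lefts = ancestors(left)
        left_counts.update(lefts)
        # Cross-level step: count the pair at every combination of levels.
        pair_counts.update(product(lefts, ancestors(right)))
    rules = {}
    for (l, r), support in pair_counts.items():
        confidence = support / left_counts[l]
        if support >= min_support and confidence >= min_confidence:
            rules[(l, r)] = (support, confidence)
    return rules

def disambiguate(left_sememe, candidate_senses, rules):
    """Choose the candidate sense matched by the most specific rule.

    Ties in specificity are broken by rule confidence.
    Returns None when no rule matches any candidate.
    """
    best_sense, best_key = None, None
    for sense in candidate_senses:
        for i, l in enumerate(ancestors(left_sememe)):
            for j, r in enumerate(ancestors(sense)):
                if (l, r) in rules:
                    # Fewer generalisation steps means a more specific rule.
                    key = (-(i + j), rules[(l, r)][1])
                    if best_key is None or key > best_key:
                        best_sense, best_key = sense, key
    return best_sense

# Toy corpus of (left, right) sememe pairs, invented for illustration.
CORPUS = [("eat", "fruit"), ("eat", "fruit"), ("eat", "food"),
          ("use", "tool"), ("use", "tool")]
RULES = mine_binary_rules(CORPUS)
CHOSEN = disambiguate("eat", ["tool", "fruit"], RULES)
```

In this sketch, overly general pairs such as `("act", "tool")` are dropped by the confidence filter, and ranking by specificity keeps a very general rule such as `("eat", "physical")` from outweighing a specific one during disambiguation.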
Keywords/Search Tags: natural language processing, machine translation, semantic analysis, semantic rule, association rule, data mining, pattern matching, disambiguation, corpus, HowNet