The Research And Implementation Of Rule-based Chinese-English Machine Translation

Posted on: 2014-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: J Tang
GTID: 2268330398494103
Subject: Computer application technology
Abstract/Summary:
When part-of-speech tags are used to build production rules, a coarse division of the tags leaves the rules unable to describe the syntactic features of Chinese accurately. This thesis studies the classification of Chinese words in detail and establishes production rules that describe the syntactic features of Chinese accurately. A second research focus is the semantic disambiguation of verbs.

A major Chinese part of speech can usually be divided further, and the subclasses within it have different syntactic functions. In this thesis, therefore, the nouns, verbs, adjectives, quantifiers, adverbs, pronouns, prepositions and auxiliaries of Chinese are subdivided, a corresponding set of part-of-speech tags is defined, and a set of production rules is built from that tag set and the syntactic features of Chinese. For the semantic disambiguation of verbs, the Entity Sememe Tree is introduced and a semantic repository of Chinese is built. The repository contains the semantic information of Chinese words together with the subject-matching and object-matching semantic categories of each verb. In the repository, the semantic categories of nouns and pronouns and the subject-matching and object-matching semantic categories of verbs are marked with sememes from the Entity Sememe Tree. The semantic distance is then calculated between the object-matching semantic category of the verb and the semantic category of the object in the input sentence, and the meaning in the record with the minimum semantic distance is taken as the meaning of the ambiguous verb.

With this treatment, the production rules can describe the syntactic features of Chinese accurately, so some syntax errors can be detected and some wrong analyses can be avoided when the structure of a Chinese sentence is analyzed. For the semantic disambiguation of verbs, the semantic matching method achieves satisfactory results.

Although this thesis makes some progress in the classification of Chinese parts of speech, the design of the production rules and the semantic disambiguation of verbs, there is still much room for improvement. In the classification of Chinese parts of speech, their syntactic functions can be studied further and the classification adjusted accordingly: to reach the most appropriate classification, some parts of speech can be merged and others divided further. In the design of the production rules, the syntactic features of Chinese should be studied further so that a new set of production rules can be obtained that describes them more accurately. In the semantic disambiguation of verbs, the calculation of semantic distance considered only the sememe distance on the Entity Sememe Tree and did not consider weights. The weights relate to the semantic density of the Entity Sememe Tree.
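The minimum-distance matching described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the thesis: the tree fragment, the sememe names, the repository records for the verb 看 and the helper names (`PARENT`, `VERB_SENSES`, `NOUN_CATEGORY`, `sememe_distance`, `disambiguate`) are all hypothetical, and the distance is taken to be the unweighted number of edges between two sememes, in line with the abstract's remark that weights were not considered.

```python
# Hypothetical fragment of an entity sememe tree, stored as child -> parent.
# The real Entity Sememe Tree is far larger; these nodes are illustrative only.
PARENT = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "human": "animate",
    "animal": "animate",
    "inanimate": "thing",
    "food": "inanimate",
    "artifact": "inanimate",
    "publication": "artifact",
}

# Hypothetical repository records: verb -> list of
# (English meaning, object-matching semantic category).
VERB_SENSES = {
    "看": [("read", "publication"), ("visit", "human")],
}

# Hypothetical semantic categories of nouns that may appear as objects.
NOUN_CATEGORY = {
    "信": "artifact",   # letter
    "病人": "human",    # patient
}


def path_to_root(sememe):
    """Return the list of sememes from `sememe` up to the root, inclusive."""
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT[sememe]
    return path


def sememe_distance(a, b):
    """Unweighted distance: number of edges between sememes a and b."""
    depth_from_a = {node: i for i, node in enumerate(path_to_root(a))}
    # Walking up from b, the first node also above a is the lowest common ancestor.
    for j, node in enumerate(path_to_root(b)):
        if node in depth_from_a:
            return depth_from_a[node] + j
    raise ValueError(f"no common ancestor for {a!r} and {b!r}")


def disambiguate(verb, obj):
    """Pick the verb meaning whose object-matching category is nearest to the object."""
    obj_category = NOUN_CATEGORY[obj]
    meaning, _ = min(
        VERB_SENSES[verb],
        key=lambda record: sememe_distance(record[1], obj_category),
    )
    return meaning


if __name__ == "__main__":
    for obj in ("信", "病人"):
        print(obj, disambiguate("看", obj))
```

With weights left out, every edge counts equally regardless of how densely populated that region of the tree is, which is the limitation the abstract points to.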
Keywords: Natural Language Understanding, Machine Translation, Syntactic Analysis, Semantic Disambiguation, HowNet