
Research on Modular Chinese Sentence Similarity Computation Based on HowNet

Posted on: 2011-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Zhang
GTID: 2178330338978194
Subject: Computer application technology
Abstract/Summary:
Sentence similarity calculation has long been an active and difficult research topic. In natural language processing, and especially in Chinese information processing, it is a fundamental, core problem. It is widely used in automatic question answering, information retrieval, information filtering, intelligent retrieval, machine translation, and related areas. The state of research on sentence similarity and the accuracy of its results directly affect progress in these fields. This thesis analyzes in depth the relationship between sememes and concepts and the constituents of sentences, which are the key issues in sentence similarity calculation. After analyzing and comparing existing sentence similarity methods, it presents an improved modular method for calculating Chinese sentence similarity. The method first identifies the central predicate word of a sentence and then uses that predicate verb to divide the sentence into a subject chunk, a predicate chunk, and an object chunk. Because the chunks contribute differently to the meaning of a sentence, each is given a different weight. The similarity of each pair of corresponding chunks is then computed, and these values are combined using the chunk weights to give the sentence similarity. Experimental results show that the method is practical and effective.
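The weighted combination of chunk similarities described in the abstract can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the weight values, the names `ASSUMED_WEIGHTS`, `word_overlap`, and `sentence_similarity`, and the use of simple word overlap as the chunk similarity are all assumptions, since the abstract gives neither the weights nor the chunk-level formula. The sentences are assumed to be chunked already.

```python
from typing import Callable, Dict, List

Chunks = Dict[str, List[str]]

# Hypothetical weights; the abstract does not give the actual values.
ASSUMED_WEIGHTS: Dict[str, float] = {"subject": 0.3, "predicate": 0.4, "object": 0.3}


def word_overlap(a: List[str], b: List[str]) -> float:
    """Stand-in chunk similarity (Jaccard overlap of words), not the HowNet-based measure."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty chunks are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def sentence_similarity(
    first: Chunks,
    second: Chunks,
    weights: Dict[str, float] = ASSUMED_WEIGHTS,
    chunk_sim: Callable[[List[str], List[str]], float] = word_overlap,
) -> float:
    """Weighted average of the similarities of corresponding chunks."""
    total_weight = sum(weights.values())
    weighted = sum(
        weight * chunk_sim(first.get(name, []), second.get(name, []))
        for name, weight in weights.items()
    )
    return weighted / total_weight


# "猫吃鱼" (the cat eats fish) vs "狗吃鱼" (the dog eats fish), chunked by hand.
cat = {"subject": ["猫"], "predicate": ["吃"], "object": ["鱼"]}
dog = {"subject": ["狗"], "predicate": ["吃"], "object": ["鱼"]}
print(sentence_similarity(cat, dog))
```

Passing the chunk similarity in as a function keeps the weighting step separate from the word-level measure, so a HowNet-based similarity could be substituted for `word_overlap` without changing the combination logic.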
The main contributions of this thesis are as follows. First, building on existing research into similarity calculation methods, it uses the rich semantic information provided by HowNet to propose a sememe similarity calculation method based on the degree of sememe overlap. The method takes into account the degree to which sememes overlap in the sememe tree structure, the semantic distance between sememe nodes, and the difference in the levels of those sememes, and uses these as the basis of sememe similarity in order to improve the calculated results. Second, it presents a method for recognizing the central predicate word of a Chinese sentence, based on Chinese sentence grammar and the characteristics of the words in a sentence. The method first identifies the central predicate word and then uses that predicate verb to divide the sentence into a subject chunk, a predicate chunk, and an object chunk. Third, it proposes a modular Chinese sentence similarity calculation method based on HowNet and gives an analysis of the algorithm. Its main advantage is that it makes full use of the word similarity calculation based on sememe overlap and takes both the syntactic and the semantic information of sentences into account, so that it better reflects the actual similarity between Chinese sentences. Experimental results show that the modular Chinese sentence similarity calculation is an effective method.
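To make the three sememe-level factors concrete (overlap in the tree, distance between nodes, and level difference), here is a rough sketch on a toy hierarchy. Everything in it is an assumption made for illustration: the tree `TOY_PARENT` is not taken from HowNet, and the combining formula with its parameters `alpha` and `beta` is a placeholder, because the abstract does not state the formula used in the thesis.

```python
from typing import Dict, List, Optional

# Toy sememe hierarchy (child -> parent); hypothetical, not taken from HowNet.
TOY_PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "inanimate": "thing",
    "animal": "animate",
    "human": "animate",
    "artifact": "inanimate",
}


def path_from_root(sememe: str) -> List[str]:
    """Return the nodes from the root down to the given sememe."""
    path: List[str] = []
    node: Optional[str] = sememe
    while node is not None:
        path.append(node)
        node = TOY_PARENT[node]
    return path[::-1]


def sememe_similarity(a: str, b: str, alpha: float = 1.0, beta: float = 0.5) -> float:
    """Assumed formula: overlap / (overlap + alpha * distance + beta * level_gap)."""
    path_a, path_b = path_from_root(a), path_from_root(b)
    # Overlap: number of shared ancestors, counted from the root.
    overlap = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        overlap += 1
    # Distance: edges between the two nodes through their lowest common ancestor.
    distance = (len(path_a) - overlap) + (len(path_b) - overlap)
    # Level gap: difference in depth between the two nodes.
    level_gap = abs(len(path_a) - len(path_b))
    return overlap / (overlap + alpha * distance + beta * level_gap)


print(sememe_similarity("animal", "human"))
print(sememe_similarity("animal", "artifact"))
```

With this placeholder formula, sememes that share a longer path from the root and sit closer together score higher, so "animal" is closer to "human" than to "artifact", and identical sememes score 1.0.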
Keywords/Search Tags: HowNet, sememe similarity computation, word similarity computation, sentence similarity computation