Research Of Comprehensive Weighted Word Semantic Similarity Computation

Posted on: 2012-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xu
Full Text: PDF
GTID: 2248330362971573
Subject: Computer application technology
Abstract/Summary:
In Chinese information processing, text similarity computation is an active research topic that is widely used in information retrieval, automatic question-answering systems, and machine translation. Current text similarity algorithms are mainly based either on statistics or on a semantic dictionary. The latter computes similarity from the semantic side, and its results are close to human judgments. It is also easier to calculate than statistical algorithms. The word similarity computation studied in this paper is therefore based on the “HowNet” semantic dictionary.

Building on traditional word similarity computation, a similarity calculation method is proposed that takes the depth and density of the semantic tree into account, and a genetic algorithm is then used to search for the optimal weights.

First, the paper introduces current research on sentence and word similarity computation, focusing on word similarity calculation methods based on “HowNet”. Second, a sememe similarity calculation method is put forward that considers many of the factors affecting word similarity results and makes full use of the information in the sememe tree, such as depth and density. A genetic algorithm is then employed to search for the optimal weights, which avoids the unreliability and subjectivity of weights set by experience. Finally, the proposed word similarity computation method is applied in experiments on automatic testing of subjective questions, and the experimental results demonstrate the effectiveness and precision of the method.
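
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the thesis's actual method. It assumes a hypothetical toy sememe tree (not taken from HowNet), three made-up features (path distance, depth of the common ancestor, and density measured as the ancestor's child count), and invented target ratings. A small genetic algorithm then searches for the feature weights instead of setting them by hand.

```python
import random

# Hypothetical toy sememe tree (child -> parent); not real HowNet data.
PARENT = {"entity": None, "animal": "entity", "plant": "entity",
          "dog": "animal", "cat": "animal", "tree": "plant"}

def path_to_root(node):
    """Return the list of nodes from `node` up to the root."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path

def depth(node):
    return len(path_to_root(node))  # the root has depth 1

def common_ancestor(a, b):
    ancestors = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors:
            return node

def child_count(node):
    return sum(1 for parent in PARENT.values() if parent == node)

def features(a, b):
    """Assumed features: distance, ancestor depth, ancestor density."""
    lca = common_ancestor(a, b)
    dist = depth(a) + depth(b) - 2 * depth(lca)
    max_depth = max(depth(n) for n in PARENT)
    max_children = max(child_count(n) for n in PARENT)
    return (1.0 / (1.0 + dist), depth(lca) / max_depth,
            child_count(lca) / max_children)

def similarity(a, b, weights):
    """Weighted sum of the features; identical sememes score 1.0."""
    if a == b:
        return 1.0
    return sum(w * f for w, f in zip(weights, features(a, b)))

def normalize(ws):
    """Clamp weights to be positive and rescale them to sum to 1."""
    ws = [max(w, 1e-9) for w in ws]
    total = sum(ws)
    return [w / total for w in ws]

def fitness(weights, rated_pairs):
    """Negative mean squared error against the target ratings."""
    err = sum((similarity(a, b, weights) - t) ** 2 for a, b, t in rated_pairs)
    return -err / len(rated_pairs)

def evolve_weights(rated_pairs, generations=50, pop_size=20, seed=0):
    """Tiny genetic algorithm: elitist selection, averaging crossover,
    Gaussian mutation."""
    rng = random.Random(seed)
    pop = [normalize([rng.random() for _ in range(3)]) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, rated_pairs), reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            child = [(x + y) / 2 + rng.gauss(0, 0.05) for x, y in zip(p, q)]
            children.append(normalize(child))
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, rated_pairs))

# Invented ratings, for illustration only.
RATED = [("dog", "cat", 0.8), ("dog", "tree", 0.2),
         ("cat", "tree", 0.2), ("animal", "plant", 0.3)]
best = evolve_weights(RATED)
print(best, similarity("dog", "cat", best), similarity("dog", "tree", best))
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases between generations. A real system would replace the toy tree with HowNet sememes and the invented ratings with human-judged word pairs.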
Keywords/Search Tags: Similarity, Word Similarity, Weight, HowNet, Genetic Algorithm