Wikipedia-based Semantic Comparison

Posted on: 2012-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Sheng
Full Text: PDF
GTID: 2208330335497714
Subject: Computer software and theory
Abstract/Summary:
Semantic comparison is an important branch of natural language processing, and many of its problems remain open. This thesis presents three Wikipedia-based approaches to semantic comparison, which use the link network between pages, the category information, and the page content, respectively, as semantic background knowledge. It closes with a comprehensive comparative analysis of the three.

Link information (Link-Wikipedia): Every Wikipedia page has outgoing links (outlinks) and incoming links (inlinks), and together these links weave the pages into one large network. We assign a weight to every link and then use an approximate shortest-path algorithm to compute the shortest distance between pages. The distance between two pages is taken as the measure of their semantic similarity.

Category information (Cate-Wikipedia, Ecate-Wikipedia): In Wikipedia, every term belongs to one or more categories. For each page, Cate-Wikipedia generates an interpreted vector from this category information, which includes the number of categories the page belongs to, the number of pages contained in each category, and the depth of the category in the category tree. Semantic comparison is then carried out by computing the cosine distance between interpreted vectors.

Page content (ESA-Wikipedia): Wikipedia serves as a huge knowledge base in which each page covers exactly one topic, and all the content of a page describes that topic. ESA-Wikipedia likewise generates an interpreted vector for every page and then measures semantic similarity by computing the cosine distance between the interpreted vectors.

Finally, we present an application of semantic comparison: using it to offer suggestions that help users find the feature they want among hundreds of thousands of features. The complete system uses the algorithms described in this thesis, which not only proves their feasibility but also demonstrates the importance of semantic comparison.
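
The two comparison primitives named in the abstract, a shortest distance over a weighted link network and a cosine measure between interpreted vectors, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy link weights, and the toy vectors are hypothetical, and exact Dijkstra search stands in for the approximate shortest-path algorithm the thesis mentions, whose details the abstract does not give. The sketch computes cosine similarity, so higher values mean closer pages.

```python
import heapq
import math


def shortest_distance(graph, source, target):
    """Exact Dijkstra over a weighted link graph given as {page: {neighbor: weight}}.

    Stand-in for the thesis's approximate algorithm, which is not specified here.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, page = heapq.heappop(queue)
        if page == target:
            return dist
        if dist > best.get(page, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph.get(page, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf  # target is not reachable from source


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as {dimension: value}."""
    dot = sum(value * v.get(dim, 0.0) for dim, value in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy data: link weights and category-based vectors are made up.
links = {
    "Cat": {"Mammal": 1.0, "Pet": 2.0},
    "Mammal": {"Dog": 1.0},
    "Pet": {"Dog": 0.5},
    "Dog": {},
}
cat_vec = {"Felines": 1.0, "Pets": 1.0}
dog_vec = {"Canines": 1.0, "Pets": 1.0}

print(shortest_distance(links, "Cat", "Dog"))
print(cosine_similarity(cat_vec, dog_vec))
```

Storing vectors as dictionaries keeps them sparse, which suits interpreted vectors whose dimensions (categories or pages) are mostly zero for any single page.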
Keywords/Search Tags: Wikipedia, Semantic comparison, Page network, Category network, Interpreted vector