
Node similarity in networked information space

Posted on: 2002-06-16
Degree: M.Comp.Sc
Type: Thesis
University: Dalhousie University (Canada)
Candidate: Lu, Wangzhong
GTID: 2468390011493561
Subject: Computer Science
Abstract/Summary:
Networked information spaces contain information entities, corresponding to nodes, which are connected by associations, corresponding to links in the network. Examples of networked information spaces include the World Wide Web, where information entities are web pages and associations are hyperlinks, and the scientific literature, where information entities are articles and associations are references to other articles. Similarity between information entities in a networked information space can be defined not only based on the content of the information entities, but also based on the connectivity established by the associations present. This paper explores the definition of similarity based on connectivity only, and proposes several algorithms for this purpose. Our metrics take advantage of the local neighborhoods of the nodes in the networked information space. Therefore, explicit availability of the entire networked information space is not required, as long as a query engine is available for following links and extracting the local neighborhoods needed for similarity estimation. Two variations of similarity estimation between two nodes are described: one based on the separate local neighborhoods of the two nodes, and another based on the joint local neighborhood expanded from both nodes at the same time. The algorithms are implemented and evaluated on the citation graph of computer science. The immediate application of this work is in finding papers similar to a given paper in a digital library, but the algorithms are also applicable to other networked information spaces, such as the Web.
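The idea of connectivity-only similarity over separate local neighborhoods can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not state the metric, so the Jaccard overlap used here is an assumption, and the names `local_neighborhood`, `neighborhood_similarity`, `toy_links`, and the `radius` parameter are hypothetical. The `get_links` callback stands in for the query engine that follows links from a node.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Set

Node = Hashable
LinkFn = Callable[[Node], Iterable[Node]]


def local_neighborhood(start: Node, get_links: LinkFn, radius: int) -> Set[Node]:
    """Breadth-first expansion: nodes reachable from start in at most radius link-follows."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the requested radius
        for nxt in get_links(node):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


def neighborhood_similarity(a: Node, b: Node, get_links: LinkFn, radius: int = 1) -> float:
    """Jaccard overlap of the two separate neighborhoods (assumed metric, for illustration)."""
    na = local_neighborhood(a, get_links, radius) - {a, b}
    nb = local_neighborhood(b, get_links, radius) - {a, b}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Hypothetical toy citation graph: paper -> papers it references.
TOY_GRAPH = {
    "p1": ["r1", "r2", "r3"],
    "p2": ["r2", "r3", "r4"],
    "p3": ["r5"],
}


def toy_links(node: Node) -> Iterable[Node]:
    """Stand-in for a query engine; only this function touches the graph."""
    return TOY_GRAPH.get(node, [])


print(neighborhood_similarity("p1", "p2", toy_links))  # 0.5: two shared references out of four
print(neighborhood_similarity("p1", "p3", toy_links))  # 0.0: no shared references
```

Because the graph is reached only through `get_links`, the sketch never needs the whole network in memory, which mirrors the abstract's point that a query engine for following links is sufficient.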
Keywords/Search Tags: Networked information, Computer science, Similarity, Associations