Node similarity in networked information space

Posted on:2002-06-16

Degree:M.Comp.Sc

Type:Thesis

University:Dalhousie University (Canada)

Candidate:Lu, Wangzhong

Full Text:PDF

GTID:2468390011493561

Subject:Computer Science

Abstract/Summary:

Networked information spaces contain information entities, corresponding to nodes, which are connected by associations, corresponding to links in the network. Examples include the World Wide Web, where information entities are web pages and associations are hyperlinks, and the scientific literature, where information entities are articles and associations are references to other articles. Similarity between information entities in a networked information space can be defined not only by the content of the entities, but also by the connectivity established by the associations between them. This thesis explores the definition of similarity based on connectivity only, and proposes several algorithms for this purpose. The proposed metrics use only the local neighborhoods of the nodes in the networked information space. The full network therefore does not need to be explicitly available, as long as a query engine is available for following links and extracting the local neighborhoods needed for similarity estimation. Two variations of similarity estimation between two nodes are described: one based on the separate local neighborhoods of the two nodes, and another based on the joint local neighborhood expanded from both nodes at the same time. The algorithms are implemented and evaluated on the citation graph of computer science. The immediate application of this work is finding papers similar to a given paper in a digital library, but the algorithms are also applicable to other networked information spaces, such as the Web.
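
As a rough illustration of the "separate local neighborhoods" idea, the sketch below expands a fixed-radius neighborhood around each of two nodes and compares the two sets. The abstract does not state the thesis's actual metrics, so the Jaccard overlap used here is a stand-in chosen for illustration only. The names `local_neighborhood`, `separate_neighborhood_similarity`, `neighbors`, and `radius` are hypothetical, and the `neighbors` callable plays the role of the query engine that follows links. The joint-neighborhood variant is not sketched.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Set

Node = Hashable


def local_neighborhood(
    start: Node,
    neighbors: Callable[[Node], Iterable[Node]],
    radius: int,
) -> Set[Node]:
    """Collect all nodes within `radius` links of `start` (breadth-first).

    `neighbors` is a hypothetical stand-in for a query engine: given a node,
    it returns the nodes directly linked to it.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the requested radius
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


def separate_neighborhood_similarity(
    a: Node,
    b: Node,
    neighbors: Callable[[Node], Iterable[Node]],
    radius: int = 1,
) -> float:
    """Jaccard overlap of the two separately expanded neighborhoods.

    Assumption: Jaccard is an illustrative choice, not the thesis's metric.
    The two query nodes themselves are excluded from both sets.
    """
    na = local_neighborhood(a, neighbors, radius) - {a, b}
    nb = local_neighborhood(b, neighbors, radius) - {a, b}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Toy citation graph (made-up data): each paper maps to the papers it cites.
toy_graph = {
    "p1": ["r1", "r2", "r3"],
    "p2": ["r2", "r3", "r4"],
    "p3": ["r5"],
}


def toy_neighbors(node: Node) -> Iterable[Node]:
    return toy_graph.get(node, [])


sim_p1_p2 = separate_neighborhood_similarity("p1", "p2", toy_neighbors)
sim_p1_p3 = separate_neighborhood_similarity("p1", "p3", toy_neighbors)
```

In the toy graph, p1 and p2 share two of four distinct cited papers, so their score is 0.5, while p1 and p3 share none and score 0.0. Only the neighborhoods of the two query nodes are ever fetched, which reflects the abstract's point that the full network need not be available.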

Keywords/Search Tags:

Networked information, Computer science, Similarity, Associations

Related items

1	The Interdiscipline Study Of Information Science And Computer Science
2	The identification of differentiating success factors for students in computer science and computer information systems programs of study
3	A Visualization Research About Hotspots And Cutting-edge Fields Of Information Science In America
4	Analysis Of Knowledge Flow From Library And Information Science To Computer Science Based On LDA Model
5	Research In Networked Information Resources
6	Research And Analysis On Management Information System Of Student Associations In Guizhou Urban Vocational College
7	The Study Of How IT Groups Influences Computer Science Teaching
8	Personal information management in computer science research
9	Crowdsourcing Science:Public Participate In Innovation In The Context Of Networked Communication
10	Computer programming: Science, art, or both