Characterizing and mining citation graph of computer science literature

Posted on: 2002-04-05
Degree: M.Comp.Sc
Type: Thesis
University: Dalhousie University (Canada)
Candidate: An, Yuan
Full Text: PDF
GTID: 2468390011493743
Subject: Computer Science
Abstract/Summary:
Computer science literature, like many natural systems, forms a directed graph, which we call the Citation Graph of Computer Science Literature: its nodes are articles, and its edges are links from a paper to the articles it cites. With hundreds and thousands of papers published each year in computer science, there is growing interest in exploring the features hidden in such a huge directed graph using modern graph-theoretic techniques. In this study, we constructed a web robot that queries the prominent computer science digital library ResearchIndex to build citation graphs. With a reasonably sized citation graph in hand, we first verified that the in-degrees of nodes (i.e., the citation counts of articles) follow a power-law distribution. Next, we applied a series of graph-theoretic algorithms to it (Weakly Connected Component, Strongly Connected Component, Biconnected Component, Global Minimum Cut, Max-flow Min-cut, and Dijkstra's Shortest Path) and analyzed the results numerically. Our study indicates that the citation graph formed by computer science literature is very well connected and that its widespread connectivity does not depend on “hubs” and “authorities”. The experimental results also show that the macroscopic structure of the citation graph differs from that of the Web graph, which follows the Bow Tie model. Finally, we provide diameter measurements based on the citation graph built by querying ResearchIndex, which is a subset and snapshot of the whole citation graph.
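As a rough illustration of two of the measurements named in the abstract, the sketch below computes an in-degree histogram and the weakly connected components of a small citation graph. It is not the thesis's code: the function names and the toy edge list are hypothetical, and a real run would use edges collected from a digital library rather than hand-written pairs.

```python
from collections import Counter, defaultdict


def in_degree_histogram(edges):
    """Map each in-degree k to the number of articles cited exactly k times."""
    nodes = set()
    in_deg = Counter()
    for citing, cited in edges:
        nodes.update((citing, cited))
        in_deg[cited] += 1
    # Articles that are never cited get in-degree 0.
    return Counter(in_deg[n] for n in nodes)


def weakly_connected_components(edges):
    """Return the components as a list of sets, ignoring edge direction."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack = [start]
        seen.add(start)
        comp = set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(comp)
    return components


# Hypothetical toy data: each pair is (citing article, cited article).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C"), ("E", "F")]
hist = in_degree_histogram(edges)
components = weakly_connected_components(edges)
```

On a large graph, plotting the histogram on log-log axes is one common way to check whether the in-degrees look roughly like a power law; a straight-line trend is suggestive but not conclusive.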
Keywords/Search Tags:Citation graph, Computer science, Science literature