Font Size: a A A

The Research And Implementation Of Subgraph Matching Algorithms On Web-scale Graph Data

Posted on:2014-03-09Degree:MasterType:Thesis
Country:ChinaCandidate:Y YangFull Text:PDF
GTID:2308330473451120Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Graphs as important data structures in computer science have strong expression ability. They are used to model many complicated relationships in real life, such as protein-protein network, chemical molecular structure, social networks, etc. However, many problems about a graph are very difficult because they are proved to be NP-complete problems. With the rapid development of information and network technology in recent years, especially the rising popularity of the Internet, the explosion of information happens in all kinds of fields, which leads the rapid expansion of the scale of graphs. And how to find what you want from such a large-scale graph has become a practical problem in front of us which is urgent to be solved, while it is full of challenge solving the problem. So query on web-scale graph has become a research hotpot in database.Subgraph matching in this paper is defined as follows. For a web-scale graph and a query graph Q, we retrieve all subgraphs of the web-scale graph which are isomorphic to Q. The major challenge of subgraph matching is subgraph isomorphism problem which has been proved to be NP complete problem. In order to address the issue, we use the graph indexing and distributed parallel processing approaches.In the index-based subgraph matching algorithm, we overcome the shortcomings of traditional vertex-at-one-time subgrpah matching and propose an efficient path-at-a-time index-based subgrpah matching. The index maintains for each vertex of a web-scale graph a compact indexing structure comprising of decomposed shortest path information within the vertex’s vicinity. Because the index can match one path at a time, it is very simple to handle a web-scale graph.In the cloud-based subgraph matching algorithm, we adopt the distributed graph system based on memory cloud. And the system provides users with transparent interfaces, so a user can operate a web-scale graph as if it is stored in memory of a single machine. Considering a fact that factors such as a distributed environment and data update delay will aggravate the difficulty of index maintenance, so we give up constructing an index to speed up the efficiency of a subgraph matching and use a graph exploration technology to make up performance decline for the lack of an index.
Keywords/Search Tags:Web-scale graph, subgraph matching, cloud environment, Path Index
PDF Full Text Request
Related items