Font Size: a A A

Research And Implementation Of Efficient Dense Subgraph Query Algorithm

Posted on:2013-08-15Degree:MasterType:Thesis
Country:ChinaCandidate:J LiFull Text:PDF
GTID:2298330467478770Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Graph is a flexible data structure, it comes to relational which is expressed in the form of graph. Graph haves many applications in the entity, soft and social network. However, the big graph is difficult to be understood and applied. People tend to like a certain node and connection information, while other parts are not relevant. Digging a connectivity subgraph is an application from the big graph. For example, consider the biological graph of one million nodes and eight million edges while generating a given several diseases, dig out a small dense subgraph, requiring as much as possible the connectivity of the various diseases, and to ensure a direct connection dense. With dozens of nodes dense subgraph gives a good illustration on the direct relationship of these types of diseases. However, several diseases connected relationship, is not clearly given by the original biological graph.The article focuses on the dense subgraph query algorithm. Firstly, give the definition of the problem:Given some of the query node and the data graph, find a connectivity subgraph that contains the query node from the graph and meet the score function largest. The evaluation function is defined mainly consider two factors:one is subgraph correlation, and the other is subgraph density. Secondly, according to the characteristics those Random walks which calculate the relevance need sync iteration, a quick calculation of the correlation of asynchronous update the BFS correlation method is proposed. Thildly, a Fastsubgraph discovered dense subgraph algorithm is proposed. The specific idea of the Fastsubgraph is to first look for a subgraph set which contains the query node, then the dynamic combination of the subgraph set, the score function screening score maximum candidate subgraph, and finally expand the subgraph which not meet the node number budget, to get the finally query subgraph.At last, There is also improved that CEPS and the SISP algorithm, the purpose is that two algorithms can find a dense subgraph. The main idea of the CEPSPlus algorithm is to first find a connected subgraph of the query point, and then add the node until the subgraph reaches the limit of number of nodes and to meet the score function largest. For SISPplus algorithm, the particle initialization phase has been improved to make it faster, modify the fitness formula is to calculate the particle; According to dense information characteristics, the update stage of the algorithm has been improved so that it can found the dense subgraph.In the experimental stage, Fastsubgraph, CEPSplus, SISPplus algorithm are compared in the random walk of the correlation and breadth-fist traversal of the correlation of performance and parameter information. Indirect proof of the breadth-first traversal of the correctness of the correlation proposed in this paper.User Review found that Fastsubgraph, CEPSplus, SISPplus algorithm can find a dense subgraph, but found Fastsubgraph dense subgraph better.
Keywords/Search Tags:dense, Non-monotonic evaluation functions, correlation, dynamic combination
PDF Full Text Request
Related items