Font Size: a A A

Research On Graph Aggregation Algorithm Based On Conditional Entropy

Posted on:2015-09-27Degree:MasterType:Thesis
Country:ChinaCandidate:Q P PanFull Text:PDF
GTID:2208330431976723Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Graph summarization is to obtain a concise super-graph covering the most information of the underlying input graph, and it is used to extract summarization, solve storage consumption and protect privacy in social networks. Super-graph generally consists of two parts:super-node and super-edge. Super-node contains the vertex set which has the similar structure or attributes, and super-edge is the edge set which exists between the super-nodes. This paper mainly researches the graph summarization with the similar structure. Simultaneously, we obtain the existing probability of the edges between the super-nodes by identifying the weight of the super-edge. Unlike the previous algorithms that only consider the consistency of the single grouping, This paper introduces the conditional entropy to measure the influence of the single grouping on the whole super-graph. What’s more, we relax the restrictions on the input parameters, and allow the user to decide the threshold value of the error between the original graph and the super-graph, then adjust the precision of the super-graph.Based on conditional entropy, this paper proposes two constructing algorithm of the super-graph:Top-Down graph Summarization TDS and Bottom-Up graph Summarization BUS. TDS top-down divides the original graph, and it enumerates all candidate set of the grouping, then gets the succinct super-graph which meets the threshold value of the error and has the minimum conditional entropy. Since the index time complexity of TDS, and the lack of practicality in the big graph, we propose BUS with polynomial time complexity. It adopts the greedy strategy to agglomerate local optimal grouping. To further optimize the algorithm, we user the structural similarity of vertexes in one grouping to filtrate the candidate set of the grouping, then to decrease the time and space consumption.This paper studies the queries of the super-graph which is build up by the proposed algorithms:adjacency query, degree query and clustering coefficient query. In the super-graph, adjacency query get the adjacency probability of two vertexes by the weight of super-edge between super-nodes including these vertexes; degree query set the mean degree of the super-node as the degree of the vertex which is included by this super-node; clustering coefficient query compute the clustering coefficient of each vertex by using the adjacency query and degree query.Finally, experimental results confirm that our approaches is high-efficiency and practical.
Keywords/Search Tags:graph summarization, conditional entropy, error rate, graph query
PDF Full Text Request
Related items