Font Size: a A A

Research On Top-K Interesting Subgraph Query For Large-Scale Dynamic Labeled Graphs

Posted on:2020-09-28Degree:MasterType:Thesis
Country:ChinaCandidate:C J JiaFull Text:PDF
GTID:2370330578950931Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
A labeled graph is a special graph structure with node identification capability,which has been widely used in modeling of geo-social networks,electronic commerce networks,biological information networks and other fields.With the rapid development of science and computer technology,the labeled graphs abstracted from the above fields not only have the characteristics of traditional graphs,but also present the characteristics of huge data size,rapid data growth and frequent data update.Subgraph query has been widely studied by researchers for its importance to graph analysis.However,with the increasing of graph scale,people tend to focus only on some high-matching results among many query results,and they want to using the number of results and the relationship between query results to quickly obtain the desired results.Therefore,in order to meet users' personalized query needs,the more targeted Top-K interesting subgraph query method was born.Considering the traditional subgraph query algorithms without optimization strategies that are difficult to be applied the subgraph query for large-scale graphs,researchers focus on indexing nodes or edges in graphs according to database indexing technology or reduce the scale of data graphs by means of graph compression technology in order to speed up the query efficiency.However,most of the new subgraph query algorithms ignore the characteristics of rapid data growth and frequent data update in most labeled graphs at the present stage.At the same time,in the Top-K interesting subgraph query,the edge weight is often used to express some restriction relationship between entities,which makes it difficult to directly apply the existing subgraph query algorithms for unweighted graphs.Therefore,how to realize Top-K interesting subgraph query on large-scale and dynamically changing weighted labeled graphs becomes one of the research hotspots in graph data processing.In this paper,the Dynamic Top-K Interesting Subgraph Query(DISQtop-K)method for large-scale weighted labeled graphs is proposed to solve the above problems.DISQtop-K method consists of offline pretreatment and online query.Firstly,in the offline pretreatment,the Graph Compression Topology Feature(GCTF)index is constructed while the scale of the initial static labeled graph is compressed as much as possible.And the graph partition algorithm is executed based on the constructed GCTF index.The offline pretreatment lays the foundation for the subsequent parallel query and the packing-filtering of multiple nodes with the same label and connected and edges that are attached to these nodes.Secondly,in the online query,the candidate sets of each partition are filtered by multiple factors,such as the number of compressed nodes,adjacent node label frequency,edge weights and the maximum weight of the edges that are compressed together.And the filtered candidate sets are matched and the matching results are connected.These steps greatly improve the subgraph query efficiency.Then,using the operation records collected by the sliding window,the initial filtered candidate sets and query results are dynamically modified and the GCTF index is dynamically maintained,which improves the efficiency of dynamic query and the accuracy of dynamic query results.Finally,a large number of experiments on real datasets and synthetic datasets verify that the DISQtop-K method proposed in this paper improves the interesting subgraph query efficiency to some extent through graph compression and graph partition,and improves the accuracy of dynamic query results while improving dynamic query efficiency.
Keywords/Search Tags:large-scale dynamic labeled graph, Top-K interesting subgraph query, graph compression, graph partition, sliding window
PDF Full Text Request
Related items