Font Size: a A A

Research On Reachability Query Coverage Over Large Graphs

Posted on:2022-02-14Degree:MasterType:Thesis
Country:ChinaCandidate:J C PangFull Text:PDF
GTID:2518306497472614Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Reachability query processing is one of the basic operations of graph data management and analysis,and has always been a hot issue.Existing methods usually use tree intervals or 2hop labels based on some nodes to speed up query processing.In practice,there are two problems in this method.On one hand,no one studies which index should be used for a given graph;on the other hand,no one studies how many intervals should be used or how many nodes should be used to construct 2hop labels.In this paper,we study the problem of how to compute the ratio of the coverage of the interval label and the 2hop label w.r.t.reachability queries.Specifically,we study the following problem.First,existing methods that use tree intervals to speed up reachability query processing cannot tell how many intervals should be used.To this problem,we propose to solve it by computing the interval coverage w.r.t.reachable queries.Considering that the time complexity of the basic algorithm is too high to be used on large graphs,an optimized algorithm,namely k-DFSIC,is proposed to calculate the interval coverage quickly,which avoids the cost of enumerating the whole transitive closure.This method computes interval coverage by judging whether the reachability query between each node in the spanning tree and its descendants can be answered by interval,such that avoids the high cost of transitive closure enumeration.Based on this result,we can easily determine the appropriate number of intervals under the constraints of the actual application environment.Secondly,existing methods that use some nodes to construct a 2hop labels cannot determine how many nodes are suitable to construct 2hop labels.To solve this problem,a 2hop LC algorithm is proposed to calculate the coverage of some 2hop nodes.During the computation of 2hop labels,the set of visited nodes are divided into different subsets,and organized by a hash counting index,which can avoid the high cost of enumerating the whole transitive closure.Based on this result,we can determine the appropriate number of nodes to construct 2hop labels,thereby accelerating the reachability query processing.Finally,we conduct rich experiments on 20 real data sets.The experimental results verify the feasibility and efficiency of the proposed algorithms.
Keywords/Search Tags:reachability query, interval coverage, coverage of some 2hop nodes
PDF Full Text Request
Related items