Font Size: a A A

Algorithms Research Of Reachability Query Based On Double Label In Large-scale Graphs

Posted on:2017-06-02Degree:MasterType:Thesis
Country:ChinaCandidate:R X SunFull Text:PDF
GTID:2348330482999738Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The increasing of today's data growth has been far beyond people's imagination. Graph, as a typical mode to process mass data has received widespread attention in the research field, for example, biological networks, traffic networks, social networks, RDF data, etc., all can be abstracted as a graph. In recent years, research of reachability query of large-scale graphs has become a hot spot of current research, especially on large-scale directed acyclic graph. As graph scale is much more larger, the graph data indexing and query technology is facing new challenges, the requirement of the accuracy and efficiency of the query is more and more. At the same time, the recent accessible query technology often apply to small and medium-sized graph data and does not have good performance with large-scale graph data processing. And some query technology that can be applied to large scale graph cannot get balance among query efficiency,index building time and system storage. So, to find a new method of query that can get balance among query efficiency,index building time and system storage has became the starting point of this study.In order to get better query in large scale graph, a new double label index (TSIL) query algorithm has been put forward in this thesis.The method is based on a node level for division. Firstly, traverse the figure to get each node's weigh value outside of the situation by calculation, the main focus of nodes is defined as the key, key node represents a series of surrounding nodes with similar properties, covers the characteristics of some nodes, at the same time define index structure for different key node.The advantage of such division is that the query can look for representative right value node in advance, then query through the keys of some small range to improve the efficiency of the query some extent, in order to ensure the quick query in large scale graph. At the same time, as to the circumstance that some nodes on large scale graph may concentrate,an improved index algorithm based on the double label has also puts forward(TSIL+e),which can simplify the query steps to achieve the purpose of saving index space and quick query. Finally, a contrast test of comparing the algorithm with other known algorithms on the same graph has been done to prove that the presented TSIL method has really improved the query efficiency, and this can also test the optimization of this algorithm in large-scale figure with other known algorithms in the query time and system occupation.All in all, after the analysis of the performance between the single label and double label in large scale graph and the character that double labels are fast but not fit for large scale graph, the thesis is to put forward a new algorithm in order to achieve the query in large scale figure and make it better used in in large scale graph.
Keywords/Search Tags:large scale graph, reachability query, double label index, weigh node
PDF Full Text Request
Related items