Font Size: a A A

Research On Exact Subgraph Search Technology In Graph Database

Posted on:2015-10-11Degree:MasterType:Thesis
Country:ChinaCandidate:H H WangFull Text:PDF
GTID:2308330479489772Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Graph is a generic data structure, which can express more complex structure information than path and tree structure, such as molecules, social networks, and images. In recent years, graph structure is succeeding applied in various fields, thus graph data has quickly accumulated and increased exponentially. The condition leads graph data management becomes an important field and graph querying technology becomes one important research branch of it. Graph querying has its own difficulties and characteristics. First, data structure is too complex to manipulate. Second, subgraph isomorphism has been proved to be NP problem, which is an operation that graph querying area cannot be avoided. Third, graph data types are so complicated. It is these difficulties that leading graph data querying research is full of challenges and opportunities.This paper is focused on exact subgraph querying technologies, the major work results are as follows.First, choose an efficient frequent subgraph algorithm as this paper ’s off-line mining method among a lot of classic methods. Comparing experiments among the AGM, FSG and g Span algorithms, noting the DFS mining method –g Span is more versatility and efficiency than BFS mining methods. First, coding the graph can avoid subgraph isomorphism. Second, not using Apriori method to iterate, it can save a plenty of Intermediate calculations.Second, combined with the frequent mining pattern algorithms, we have proposed a novel technique based on frequent subgraph mining, which is called AG-Index, it is an adjacent edge pair hash indexing method. Based on the filter- verify model, AG-Index put the frequent part in the cache and other part on disk. After intersection, verify the candidates with VF2 algorithm. A lot of experiments proved that Ag-Index has a better filter power and lower memory possession than g Index and FG-Index. After that, I also made some optimize on AG-Index, experiment proved the filter power is better than AG-Index and can make a big difference on the whole efficient of the algorithm.Third, design subgraph isomorphism algorithm, combined with the existing Ullmann and VF2 algorithms. First, we pruned Ullmann using adjacent label structural information. It proved the cost is lower an order of magnitude than before. Second, paper also optimized VF2 by adjusting the traverse order of vertices. test proved that the traversal order of vertex degree rather than the order of label can have a better effect, which can pruned the unsuitable condition as soon as possible.
Keywords/Search Tags:subgraph query, frequent subgraph, filter-verify, subgraph isomorphism
PDF Full Text Request
Related items