Font Size: a A A

Research On Large-Scale Graph Subgraph Query Method Based On Feature Index

Posted on:2019-08-13Degree:MasterType:Thesis
Country:ChinaCandidate:J Y GaoFull Text:PDF
GTID:2428330545454780Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the constant development of science and technology,various subjects are increasingly tied to the computer field.As an important data structure,graph has a wide range of applications.Protein networks,social networks and electronic commerce networks are all the data structures of the graph model.With the multiplied Internet users and in-depth study of various subjects,the scale of graph data is gradually increasing,which forms a complex and dense large scale graph structure.Effective management and mining of mass graph data become the key issue.In recent years,the applications of subgraph query on large-scale graph are often in the spotlight.The traditional subgraph query algorithms are appropriate for small scale graph query.How to optimize the storage of large-scale graph structures and find the subgraphs with the structure in massive graph data effectively become a challenge of current research.Therefore,based on the advantage of index query,the thesis presents a query method to construct feature indexes on the graph data,which extracts node features and establishes the index structure offline,and traverses the index online.Based on the index,the thesis separately studies two query patterns: star structure and non-star structure.In the subgraph query of non-star structure,the thesis defines the concept of graph pattern decomposition and builds the graph models for the partial solutions.After pre-processing joins,the thesis calculates the minimum cost by using multiple join method so as to determine the join sequence,which can lead to the minimum size candidate sets and keep the query structure as small as possible.All of these can effectively improve the speed of isomorphism detection and query efficiency.The research findings of the thesis mainly consist of:(1)Presenting the definition of the feature list with label and number of the adjacency node,which stores graph by the means of converting the data graph structure into the feature list.Then classifies nodes by the features,which is the node label,the node degree,the adjacency node labels and the number of the lables of adjacency nodes,After that,according to the nodes lable,storing them and theirfeature information into the different lists.The function of the feature list is to speed up constructing indexes.Extracting feature offline and storing into feature tables speed up the construction of indexes.(2)In this paper,we propose a method to construct the dulaq-index by using the feature table,and each feature list constructs a feature index tree.The upper index and the lower layer index are designed according to the inclusion relationship with the features in the feature list.Index value is designed according to whether the number of adjacent node labels is unique.The leaf nodes store the corresponding nodes information.Experiments show that this method can improve filtering efficiency and speed up the subgraph query process.(3)According to the particularity of the index,the thesis has researched the query method of star structure subgraph.The thesis extracts the star-core feature of the star structure graph,and filters the candidates to get the result set by traversing index.After that,depending on the specificity of the storage structure,it processes the result set and then gets the final query result.This query algorithm greatly improves the filtering ability and omits the subgraph isomorphism process of the candidate set.Compared with the widely used subgraph isomorphism algorithm,this algorithm can effectively shorten the time cost of subgraph query.(4)Aiming at the query for a non-star structure query graph,the thesis presents the structural decomposition,subterm filtering,solution connection and result set isomorphism for non-star graph query.At the same time it defines the concept of schema decomposition.Combined with MVP multiple join query method based on the graph model,it can connect the partial solution with the minimum cost.The result set is a small scale graph structure.Finally get the final query result after depth first traversal of the candidate set.Experiments prove that this algorithm can greatly reduce the candidate set scale and improve the query efficiency.
Keywords/Search Tags:Subgraph query, aspect indexing, Star pattern, Schema decomposition
PDF Full Text Request
Related items