With the development of computer science and technology,graph data structure are widely used in many fields.How to retrieve and manage these data has become a research hotspot.Subgraph search and graph similarity search are two important kinds of graph search.Subgraph search refers to return the graph set which contains the query graph;graph similarity search is to retrieve all the graph data which is similar to query graph under a certain threshold.In these two kinds of queries,graph isomorphism verification or similarity computation are required,both of which have been proved to be NP-difficult problem,so the way to improve query performance is mainly filtering out non-result set as much as possible.At present,for the two kinds of queries,the basic method is based on the filtering and verification mechanism,that is,use the characteristics of the query graph to filter out the graph data set,get a smaller candidate set,then verify candidate set and return right result.In general,frequent subgraphs in the graph database are used as features to filter,because it retains the global structure information and is more use in filtering.In the sub-graph matching query problem,the common method is using features which is extracted from graph database to build index.When a query comes,expand edge to get the candidate graph set.This approach has too high time complexity,and it does not make full use of the features contained in the query graph to filter the index quickly,this paper proposed a node adjacency feature index named Vindex.In order to reduce the size of candidate sets,we extract the embedding features combined with user's query records in a specified region.This paper also used the relationship between frequent subgraphs and the relationship between feature and data graph to reduce the size of index and candidate sets.For similarity query,most of the existing methods focus on reducing the calculation time of graph editing distance,and no suitable index structure and searching method are proposed for narrowing the candidate set.This paper combine the concept of editing distance with the structure characteristics of query graph to filter the graph database,removing the highest dissimilarity graphs.Then the graphs are clustered by using XOR subgraph feature vector method.A more accurate candidate set can be obtained by using the branch mapping distance combined with the adjustment method.In the verification phase,this paper establish GDindex,which is based on the editing distance relationship.In the process of verification,we first calculate the graph edit distance between graphs in the index and query graph,then use the graph edit distance to process other data graphs in the candidate set.This way,the number of graph edit distance calculation can be reduced.Finally,this paper design experiments to compare with existing methods from three aspects:index establishment time,candidate set size and query elapse time.Experiments were perfonned on the real Pubchem dataset,and the experiment results show that the proposed method is feasible and can improve the performance of graph query. |