Font Size: a A A

Research On Exact Subgraph Matching Algorithm Of Label Graph

Posted on:2020-09-19Degree:MasterType:Thesis
Country:ChinaCandidate:S Q ZhouFull Text:PDF
GTID:2428330599959756Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
As an important data representation model,the label graph is widely used in biochemistry,social networks,knowledge graph,etc.Subgraph matching is an important operation method for graph data management,which has attracted the attention of researchers.In this thesis,the subgraph matching problem of the label graph is studied.The subgraph matching algorithm k~2-MDD-SubM for unique label graph and the subgraph matching algorithm LCCP_SubM for duplicate label graph are given respectively.The main work of this paper is as follows:(1)For the unique label graph,k~2-MDD is introduced to make it efficient and compact representation to reduce the query size,at the same time,combined with the logic operation of symbol calculation,a subgraph matching algorithm k~2-MDD-SubM is given,the algorithm transforms the original subgraph matching problem into a logical operation based on the k~2-MDD boolean function.Experiment on Graemlin and PPI data sets,for the Graemlin dataset,the experimental results show that the number of nodes that k~2-MDD-SubM needs to store is only 35.83%of the RI algorithm,when the size of the pattern graph is fixed,the query time of the algorithm is gradually reduced as the density of the pattern graph is gradually reduced,and when the pattern graph is the same as the target graph,the query efficiency is better than that of the RI algorithm.The superiority of k~2-MDD-SubM algorithm in storage and query performance is verified.For the PPI dataset,the experimental results show that the number of nodes required for k~2-MDD-SubM is only57.19%of the RI algorithm,as the density of the pattern graph is reduced,the query time is gradually reduced,which further verifies the effectiveness of the algorithm performance.(2)For the duplicate label graph,a new static variable ordering strategy is designed,in order to reduce the search space and the number of backtracking as much as possible,the ordering strategy considers the heuristic rules such as vertex local clustering coefficient and label probability in turn.At the same time,simple constraint filtering condition such as vertex label is combined to ensure the correctness of the matching result and pruning the branch that does not satisfy the matching,thereby a subgraph matching algorithm LCCP_SubM is given.Experiment on two real data sets of AIDS and NASA.For the AIDS dataset,the performance of LCCP_SubM and RI algorithms is compared and analyzed from the aspects of matching quantity,search space,matching time and total time.The experimental results show that the search space of LCCP_SubM algorithm is significantly reduced compared with RI algorithm when the number of matching is the same,and the matching time and total time are better than RI algorithm.The validity of the variable ranking strategy and the efficiency of the algorithm are verified.For the NASA dataset,the experimental results show that the search space size of LCCP_SubM is significantly smaller than that of RI algorithm,which further validates the validity of the variable ordering strategy in the LCCP_SubM algorithm.
Keywords/Search Tags:subgraph matching, unique label graph, duplicate label graph, k~2-MDD, variable ordering
PDF Full Text Request
Related items