| Graph pattern matching has been widely used in today’s society,such as social network user re-recognition,and social network anti-anonymity.Pattern matching is mainly divided into single pattern matching and multi-pattern matching.Single pattern matching is a process of finding all matching result sets that match the pattern graph in the data graphs,using sequential and independent execution steps.However,in practice,it is necessary to batch process multiple patterns.Because there are common subgraphs between multiple patterns,sequential and independent execution processes will lead to repeated calculation problems.Therefore,matching process is no longer applicable to multi-pattern matching problems.The problem of multi-pattern matching is an extension of pattern matching,whose main challenge is the concurrent execution strategy among multiple query graphs.To solve this challenge,a multi-pattern matching method for RDF graph(M-PM method for short)was proposed.Firstly,the clustering algorithm is used to divide the pattern graph into k classes,so that the dependence between the pattern graphs in each class is the strongest,and the dependence between the classes is the weakest.Secondly,for each class,the common subgraph among multiple patterns is mined by the maximum common subgraph algorithm,and the inclusion relationship among the pattern graphs is constructed into a dependency graph.According to the proposed pruning strategy,the redundant nodes and edges on the dependency graph are deleted to form a dependency tree(D-Tree).After that,a node fragmentation table(NFT)was proposed to extend the single inclusion relation of D-Tree.Finally,a speed up algorithm of multi-pattern matching was designed,which can obtain the results of multiple query graphs by one-off traversal of data graph.This thesis conducts experiments on three synthetic and real data sets,and compares with three representative algorithms.Experimental results show that the proposed M-PM method has improved execution efficiency compared with other algorithms.With the fixed scales of query graphs,matching time-efficiency is consistent with the quantity of residual edges,where the more the residual edges are and the higher the time-efficiency is. |