Research On Graph Edit Distance Algorithm For Graph Similarity Search | | Posted on:2024-06-06 | Degree:Master | Type:Thesis | | Country:China | Candidate:J W Zhang | Full Text:PDF | | GTID:2530307157483414 | Subject:Computer technology | | Abstract/Summary: | PDF Full Text Request | | Graph Edit Distance(GED)is a commonly used method for measuring the similarity between two graphs and is widely applied in graph similarity search problems.Graph Edit Distance algorithms are mainly divided into two categories based on whether the obtained cost value is accurate or approximate: exact Graph Edit Distance algorithms and approximate Graph Edit Distance algorithms.Exact Graph Edit Distance algorithms have high time and space complexity and are mainly used for the verification stage of graph similarity search and as evaluation indicators for judging the accuracy of approximate Graph Edit Distance algorithms.However,approximate Graph Edit Distance algorithms can obtain corresponding solutions in polynomial time,and approximate Graph Edit Distance algorithms based on cost matrices have high solution accuracy and have been widely researched.How to improve the solution speed of cost matrices while ensuring high accuracy is a problem that researchers are working to solve.In order to obtain Graph Edit Distance in polynomial time and ensure high solution accuracy,this article mainly focuses on the research of approximate Graph Edit Distance algorithms based on cost matrices and applies them to exact Graph Edit Distance algorithms and graph similarity search problems.The specific research content and results are as follows:(1)In view of the problem that the existing cost matrix solution algorithms have high complexity and cannot directly solve non-square matrices,this article proposes a cost matrix solution model MCMF(Minimum Cost and Maximum Flow)based on the minimum cost maximum flow problem,and based on this,a simplified solution model SMCMF(Simplified Minimum Cost and Maximum Flow)is proposed.The SMCMF model has generality,especially superior performance in solving non-square cost matrices.(2)In view of the problem that multiple shortest path searches are required in the SMCMF model solution process,this article proposes a preprocessing algorithm that can obtain an optimal matching in O(n^2)time complexity,preprocess the SMCMF model to reduce the number of shortest path searches.At the same time,considering that Algebraic Decision Diagram(ADD)can effectively improve the representation ability of pseudo-Boolean functions,this article proposes a symbolic ADD algorithm for solving Graph Edit Distance based on the pseudo-Boolean function representation of the SMCMF model,called ADD_SMCMF algorithm.Experimental results show that this algorithm is superior to the AStar-BMao algorithm and the FBP(Fast Bipartite)algorithm in terms of solving scale and time and space performance on the AIDS and Pub Chem datasets.(3)In view of the high lower bound solving complexity of the verification algorithm AStar-BMao and the problem that it needs to perform accurate verification of each graph in the candidate set during the verification stage of graph similarity search,this article introduces the solution idea of ADD_SMCMF and proposes a new lower bound solving algorithm BP-lb BMao.On this basis,a hybrid algorithm for the verification stage of graph similarity search is designed by combining improved upper and lower bound solving methods,called BP-AStar algorithm.Experimental results show that this algorithm can effectively improve the verification efficiency of the verification stage of graph similarity search. | | Keywords/Search Tags: | Graph edit distance, Best-first search, Cost matrix, Minimum cost and maximum flow, Graph similarity search, Lower bound, Algebraic decision diagram | PDF Full Text Request | Related items |
| |
|