Font Size: a A A

Research On Multi-subgraph Matching Technology In A Distributed Environment

Posted on:2023-10-25Degree:MasterType:Thesis
Country:ChinaCandidate:Z TianFull Text:PDF
GTID:2530307097479104Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Graph is an abstract structure with a long history,which originated from graph theory.Today,it has been rejuvenated in the Internet Age and has a wide range of applications,such as social networking,e-commerce,knowledge graphs,urban roads,and bioinformatics.Because of the increasing popularity of graph data,many new graph databases have appeared in the past decade.Among many graph processing algorithms in graph databases,subgraph matching is a fundamental algorithm,and it aims to find all subgraphs of a data graph that are isomorphic to a query graph.Like traditional relational databases,the problem of multiple queries exists in graph databases,i.e.,the system receives multiple query graphs and then needs to perform subgraph matching on these multiple query graphs.However,due to the inherent high complexity of the multi-subgraph matching problem,the large number of queries,and the increasing size of the data,the existing single-machine system has been unable to effectively handle the multi-subgraph matching query problem of massive data.In order to solve the multi-subgraph matching query problem of massive data,we have implemented a novel multi-subgraph matching optimization method based on distributed system.This method is optimized in three aspects: data partitioning,intraquery,and inter-query.The main contributions are as follows:(1)A novel data partitioning method based on locality-sensitive hashing is studied and designed.The data partitioning method evaluates the similarity of data vertices according to the similarity between neighboring vertices of different data vertices,so that similar data vertices are put into the same machine to reduce the communication overhead during task processing.(2)A novel task merging method based on locality-sensitive hashing is studied and designed.This method can merge multiple similar tasks into a bigger task,thus avoiding repeated computation of common parts when similar tasks are executed independently.In addition,task merging can also cut down on the number of tasks,which can reduce the additional task scheduling overhead to a certain extent.(3)A novel query grouping method based on locality-sensitive hashing is studied and designed.The method can group multiple query graphs,so that similar query graphs can be grouped into the same group,and repeated computation among multiple query graphs can be reduced.We conduct extensive experiments on real and synthetic datasets to evaluate the performance of our method.The experimental results show that our method has good scalability,and each optimization technique can produce significant optimization effects.Moreover,when dealing with multiple query graphs,our method outperforms other distributed subgraph matching algorithms by at least one order of magnitude.This demonstrates the effectiveness of our method in handling multi-subgraph matching query problem of massive data.
Keywords/Search Tags:Distributed System, Subgraph Matching, Multi-query Optimization, Locality Sensitive Hashing
PDF Full Text Request
Related items