Font Size: a A A

Design And Implementation Of Efficient Subgraph Matching Algorithm On Large

Posted on:2017-01-08Degree:MasterType:Thesis
Country:ChinaCandidate:A N JiFull Text:PDF
GTID:2278330485491393Subject:Software engineering
Abstract/Summary:PDF Full Text Request
In the era of big data, there is a massive multi-source and heterogeneous datawith complex and close relationships, using the graph structure as described complex relationships among data are widely used in various fields.However, with the rapid development of these fields, the amount of data using graph structure is also exponentially exploding.And subgraph matching as the basic operation in graph processing,get abundant attention from academia and industry.Recently, how to efficient base on user-defined query structure to get matches which meet user-defined requirements on large-scale graphbecome a primary problem. And previous work on subgraph matching don’t consider user-defined requirements for some special edges so they are powerless against query graphs with weights and existing algorithms are not suitable for large-scale graphs.In this paper, it does someimprovement and innovation in the shortcomings of subgraph matching. The main research achievements and contributions are as follows:Personalized subgraph matching in large-scale graphs is discussed. Firstly, we partition the large graph into some sub-regions by GN algorithm and buid two index structures:GP-Tree and Sorted List(SL) index which are constructed offline.Then, based on the index structures, we use optimization strategies into our method to accelerate subgraph matching. Finally, our experimental results evaluate the efficiency and scalability of our proposed method.Parallel personalizedsubgraph matching in large-Scale graphis discussed. Firstly, we partition the large graph by the Spark Graph X’s Vertex-Cut algorithm and build two index structures for the network: GP Adjacency index(GPA) and Sorted Lists index(SL) which are constructed offline. Then, based on the index structures, we propose an efficient query algorithm to achieve parallel personalized subgraph matching(Par_PSM)using Spark.Finally, our experimental results on large-scale data sets evaluate the efficiency and scalability of our proposed method.
Keywords/Search Tags:subgraph matching, subgraph isomorphism, graph patternmatching, graph partition, graph data
PDF Full Text Request
Related items