Research On Graph Query Of Large Data Based On Parallel Processing

Posted on: 2018-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Gao
Full Text: PDF
GTID: 2348330518458068
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, we have entered an era in which data is king. Data is not only very large in volume but also increasingly complex. Finding the useful data within this large and complex mass, and doing so efficiently, has become an urgent need. At the same time, distributed cloud storage has become a common solution for storing large data, so the problem becomes one of querying data held in distributed storage. A powerful and commonly used tool for querying large-scale distributed data on demand is the graph, a data structure that is well suited to data with reference relationships. Queries against large data can therefore be translated into graph query algorithms. One such problem is to decide, given two nodes in a large data graph, whether one node is reachable from the other; this is known as the graph reachability query problem. In practice, reachability queries are widely used in many fields and applications and are of great significance.

Traditional approaches to reachability queries are either limited to tree-based graph queries or targeted at a database system for one particular kind of graph. Most of these algorithms rely on an index, but they show many defects in accuracy and performance when applied to large data graphs. To address these problems, this thesis proposes a parallel reachability query algorithm built on the MapReduce programming model of the Hadoop distributed computing platform, and also proposes an index based on six-degree reachability to answer reachability queries locally. These algorithms aim to optimize reachability queries on large distributed graphs. Several experiments on data sets used in many industrial applications evaluate the algorithms from several different aspects. The experimental results show that the proposed algorithms are accurate and efficient.
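The reachability query described in the abstract can be illustrated with a small single-process sketch. This is not the thesis's algorithm, which is not given here; it only assumes that a parallel reachability search can be organized as repeated rounds in which a map step expands the current frontier along outgoing edges and a reduce step merges and deduplicates the results. The names `build_adjacency`, `map_step`, `reduce_step`, and `reachable` are hypothetical, and the optional `max_hops` bound is only one possible reading of a six-degree limit, not the index the thesis proposes.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Group directed edges (u, v) by source node."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
    return adj


def map_step(frontier, adj):
    """Map phase (illustrative): emit every out-neighbor of each frontier node."""
    for node in frontier:
        for neighbor in adj.get(node, ()):
            yield neighbor


def reduce_step(candidates, visited):
    """Reduce phase (illustrative): deduplicate and drop nodes already visited."""
    return set(candidates) - visited


def reachable(edges, source, target, max_hops=None):
    """Return True if target can be reached from source along directed edges.

    If max_hops is given, only paths of at most max_hops edges are considered.
    """
    if source == target:
        return True
    adj = build_adjacency(edges)
    visited = {source}
    frontier = {source}
    hops = 0
    # Each loop iteration corresponds to one map/reduce round.
    while frontier and (max_hops is None or hops < max_hops):
        frontier = reduce_step(map_step(frontier, adj), visited)
        if target in frontier:
            return True
        visited |= frontier
        hops += 1
    return False


if __name__ == "__main__":
    example_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "a")]
    print(reachable(example_edges, "a", "d"))
    print(reachable(example_edges, "d", "a"))
    print(reachable(example_edges, "a", "d", max_hops=2))
```

In a real Hadoop job each round would be a separate MapReduce pass over the distributed edge data, so the number of rounds, which is bounded by the hop limit when one is set, largely determines the cost of a query.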
Keywords/Search Tags: Big data, Graph query, Parallel reachability query algorithm, MapReduce