The Research Of Regular Path Query On Large-scale RDF Graph

Posted on: 2015-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: L X Jiang
Full Text: PDF
GTID: 2298330452459609
Subject: Software engineering
Abstract/Summary:
Regular path queries, or RPQs, are a basic querying mechanism on graphs that has played an increasingly important role over the past decade. In recent years, large amounts of RDF data have been published on the Web as Linked Data has developed. Data at this scale poses serious challenges to the efficiency of RPQs.

To address these problems, we devise a Double-Layer Bi-Directional index structure with linear space complexity for efficient path queries on large-scale RDF graph data. Implementations based on the Bigtable model and the distributed B+ tree model are provided in this paper. On the basis of the index structure, we propose a novel family of algorithms, named TraPath, which includes a traversal sub-path search algorithm based on DFS, a path portioning algorithm based on the indegree/outdegree nodes, and scheduling algorithms to support fast regular path queries. Parallel implementations based on the Bigdata distributed computation framework are also provided. By defining the No Simple Paths and No Counting Results Paths semantics, the complexity of RPQs is reduced to polynomial time. In addition, we conduct extensive experiments to evaluate the performance of our prototype system with a real-world RDF dataset from DBpedia.

In conclusion, we present a distributed solution for regular path queries. Based on the special path index structure, a novel family of algorithms is provided in this paper. The experiments show that the solution for regular path queries on large-scale RDF data presented in this paper has significant performance advantages.
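To illustrate why an RPQ can be answered in polynomial time when paths need not be simple and results are not counted, the following minimal Python sketch may help. It is not the TraPath algorithm or the Double-Layer Bi-Directional index from the thesis. It is the textbook approach of searching the product of the graph and a finite automaton for the path expression. The function names, the toy triples, and the hand-built automaton are all assumptions made for illustration.

```python
from collections import defaultdict, deque

def build_index(triples):
    """Hypothetical helper: index edges in both directions as
    node -> predicate -> set of neighbours (in memory, not distributed)."""
    out_idx = defaultdict(lambda: defaultdict(set))
    in_idx = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        out_idx[s][p].add(o)
        in_idx[o][p].add(s)
    return out_idx, in_idx

def eval_rpq(out_idx, start, nfa_start, nfa_accept, nfa_delta):
    """Return every node reachable from `start` along some path whose
    predicate sequence is accepted by the automaton.
    nfa_delta maps (state, predicate) -> set of next states."""
    seen = {(start, nfa_start)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in nfa_accept:
            answers.add(node)
        for label, targets in out_idx.get(node, {}).items():
            for next_state in nfa_delta.get((state, label), ()):
                for target in targets:
                    pair = (target, next_state)
                    # Each (node, state) pair is expanded at most once, so
                    # cycles terminate and the work stays polynomial.
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return answers

# Toy data (assumed for illustration), containing a cycle a -> b -> c -> a.
triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("b", "worksAt", "initech"),
    ("c", "worksAt", "acme"),
]
out_idx, in_idx = build_index(triples)

# Hand-built automaton for the path expression knows+ / worksAt.
delta = {
    (0, "knows"): {1},
    (1, "knows"): {1},
    (1, "worksAt"): {2},
}
result = eval_rpq(out_idx, "a", 0, {2}, delta)
print(sorted(result))
```

The search only records whether a (node, automaton state) pair has been reached, never how many paths reach it or whether a path repeats a node. This is the property that keeps the evaluation polynomial in the size of the graph and the automaton. The reverse index `in_idx` is unused here; it would support searching backwards from a known end node.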
Keywords/Search Tags: Regular Path Query, RDF, large-scale, distributed, NoSQL, bigdata