The Research Of Property Path Query And Reasoning On Large-scale RDF Graph

Posted on: 2015-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Ling
GTID: 2348330485993442
Subject: Computer Science and Technology
Abstract/Summary:
Property path querying is a basic method for querying graph data. It was introduced in the SPARQL 1.1 standard and is now an official W3C standard. Existing methods for evaluating property paths are all based on multiple indexes; they are neither efficient nor capable of reasoning. Moreover, with the growth of Linked Open Data, the volume of RDF data is increasing rapidly, and the resulting RDF triples form a very large graph that is difficult to query. Finding an efficient algorithm for this problem has become a challenging and attractive research topic.

To address these problems, this research makes two contributions. First, property paths cannot express nested semantics, so they cannot capture RDF Schema (RDFS) semantics either. A prototype system has been implemented whose algorithm is based on automata theory. In addition, the reasoning rules are added implicitly by translating them into nested regular expressions (NREs). In this way reasoning is supported, the query results are enriched, and both accuracy and recall are improved.

Second, the growth of RDF data makes it impossible for a single computer to carry out the required computation. To solve this kind of problem, Google proposed Pregel, a computing model in which each vertex works by sending messages to its neighbors; Apache Giraph is an open-source implementation of that model. An efficient parallel algorithm has been implemented on Giraph in this research. The algorithm obtains the query results in several supersteps followed by one back trace, and finally returns a sub-graph to the user, which is much more expressive and useful.

In summary, the method introduced in this research, based on nested regular expressions and automata theory, can use RDFS semantics for reasoning without increasing the computational complexity. Furthermore, the algorithm has been combined with the Pregel model and parallelized. In the context of big data, the parallel algorithm takes much less time and scales better. Finally, extensive experiments show that the algorithm is correct and maintains high performance, especially on large data sets. Compared with existing systems, it is faster and offers a better user experience.
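
The following Python sketch illustrates the general idea of automaton-based path evaluation in supersteps with a final back trace. It is not the thesis's implementation and does not use Giraph: the supersteps are simulated sequentially on one machine, the toy graph, the `knows+` query, the `SUB_PROPERTY_OF` axiom, and all function names are hypothetical, and the RDFS rule is applied by expanding sub-properties while traversing edges rather than by translating it into a nested regular expression. Each property is also assumed to have at most one direct super-property.

```python
from collections import defaultdict

# Toy RDF graph as (subject, predicate, object) triples; all names are made up.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksWith", "dave"),
    ("dave", "knows", "erin"),
]

# Hypothetical RDFS axiom: worksWith rdfs:subPropertyOf knows.
SUB_PROPERTY_OF = {"worksWith": "knows"}

# NFA for the property path knows+ : 0 --knows--> 1 --knows--> 1.
NFA = {(0, "knows"): {1}, (1, "knows"): {1}}
START, ACCEPT = 0, {1}


def super_properties(pred):
    """Return pred followed by every property it is a sub-property of."""
    chain = [pred]
    while chain[-1] in SUB_PROPERTY_OF and SUB_PROPERTY_OF[chain[-1]] not in chain:
        chain.append(SUB_PROPERTY_OF[chain[-1]])
    return chain


def evaluate(triples, source, reasoning=True):
    """Return (answer nodes, sub-graph of triples lying on matching paths)."""
    out_edges = defaultdict(list)
    for s, p, o in triples:
        out_edges[s].append((p, o))

    visited = {(source, START)}   # (node, automaton state) pairs seen so far
    frontier = {(source, START)}  # pairs that are active in this superstep
    preds = defaultdict(set)      # pair -> {(prev node, prev state, predicate)}

    # Each loop iteration plays the role of one superstep: active vertices
    # send (target, next state) messages along their outgoing edges.
    while frontier:
        messages = set()
        for node, state in frontier:
            for pred, target in out_edges[node]:
                labels = super_properties(pred) if reasoning else [pred]
                for label in labels:
                    for nxt in NFA.get((state, label), ()):
                        preds[(target, nxt)].add((node, state, pred))
                        if (target, nxt) not in visited:
                            visited.add((target, nxt))
                            messages.add((target, nxt))
        frontier = messages

    answers = {node for node, state in visited if state in ACCEPT}

    # One back trace from the accepting pairs collects the matching triples.
    subgraph = set()
    stack = [pair for pair in visited if pair[1] in ACCEPT]
    traced = set(stack)
    while stack:
        pair = stack.pop()
        for node, state, pred in preds[pair]:
            subgraph.add((node, pred, pair[0]))
            if (node, state) not in traced:
                traced.add((node, state))
                stack.append((node, state))
    return answers, subgraph
```

With reasoning enabled, `evaluate(TRIPLES, "alice")` reaches `dave` and `erin` through the `worksWith` edge, because that edge is also treated as a `knows` edge; with `reasoning=False` the traversal stops at `carol`. In a real Pregel-style system the `frontier` loop would be distributed across workers, with each vertex processing only its own incoming messages.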
Keywords/Search Tags: Property Paths Query, NRE, Parallel, Pregel, Automaton