
On The Statistical Analysis Of Practical SPARQL Queries

Posted on: 2018-10-02  Degree: Master  Type: Thesis
Country: China  Candidate: X W Han  Full Text: PDF
GTID: 2348330542479619  Subject: Computer Science and Technology
Abstract/Summary:
The Resource Description Framework (RDF), recommended by the World Wide Web Consortium (W3C), is a directed, labeled graph data format for representing information on the Semantic Web. The SPARQL query language, released by the RDF Data Access Working Group, is the standard language for querying RDF data. The current version, SPARQL 1.1, extends version 1.0 with additional features. Existing optimization methods mostly target queries of all kinds and have hit a bottleneck, since basic subgraph pattern matching is an NP-complete problem. This paper analyses millions of SPARQL queries in LSQ by their query features, especially their semantic features, with the aim of improving SPARQL query efficiency in the future. We first classify the queries as monotonic or non-monotonic according to the operators they contain, in particular the OPT operator. Monotonic queries are well suited to distributed, concurrent optimization. We then accurately filter the weakly monotonic queries out of the non-monotonic queries, using the close relationship between weakly monotonic and well-designed queries. Among non-monotonic queries, the weakly monotonic ones have lower evaluation complexity. Finally, we develop a sound algorithm for the satisfiability of weakly monotonic queries, which checks whether the closure of the FILTER operator is closed, and we test the satisfiability of all non-monotonic queries in LSQ. The results show that, across all the real-world queries, more than half (65.61%) are monotonic, the vast majority (99.96%) are weakly monotonic, and meaningless (unsatisfiable) queries are very rare in practice (0.01%). The semantic analysis in this paper, especially of weak monotonicity and satisfiability over millions of practical SPARQL queries, provides a realistic basis for studying these open problems. We believe the analysis will help in developing practical heuristics for processing SPARQL queries; in particular, monotonic and weakly monotonic queries can be optimized through concurrency and the reuse of query results.
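
The first two classification steps described above can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the tuple-based pattern representation and the function names are hypothetical, chosen only for illustration. It assumes that a pattern built from AND, UNION and FILTER alone (no OPT) is treated as monotonic, and it uses the standard syntactic definition of a well-designed pattern for UNION-free patterns: for every subpattern (P1 OPT P2), any variable of P2 that also occurs outside that subpattern must occur in P1. The separate safety condition on FILTER variables is left out.

```python
from collections import Counter

# Toy pattern AST (hypothetical, for illustration only):
#   ("BGP", {"?x", "?y"})          basic graph pattern with its variables
#   ("AND", left, right)
#   ("UNION", left, right)
#   ("OPT", left, right)
#   ("FILTER", pattern, {"?x"})    pattern plus variables used in the condition


def var_counts(p):
    """Count, per variable, how many leaves (BGPs or filter conditions) mention it."""
    kind = p[0]
    if kind == "BGP":
        return Counter(p[1])
    if kind == "FILTER":
        return var_counts(p[1]) + Counter(p[2])
    return var_counts(p[1]) + var_counts(p[2])


def is_monotone(p):
    """Assumed criterion: a pattern without OPT is treated as monotonic."""
    kind = p[0]
    if kind == "BGP":
        return True
    if kind == "OPT":
        return False
    if kind == "FILTER":
        return is_monotone(p[1])
    return is_monotone(p[1]) and is_monotone(p[2])


def subpatterns(p):
    """Yield p and every pattern nested inside it."""
    yield p
    if p[0] == "BGP":
        return
    yield from subpatterns(p[1])
    if p[0] != "FILTER":
        yield from subpatterns(p[2])


def is_well_designed(p):
    """Syntactic well-designedness check, intended for UNION-free patterns."""
    total = var_counts(p)
    for sub in subpatterns(p):
        if sub[0] != "OPT":
            continue
        inside = var_counts(sub)
        left_vars = set(var_counts(sub[1]))
        for v in var_counts(sub[2]):
            occurs_outside = total[v] > inside[v]
            if occurs_outside and v not in left_vars:
                return False
    return True


# (?x ?y) OPT (?x ?z): non-monotonic but well-designed
good = ("OPT", ("BGP", {"?x", "?y"}), ("BGP", {"?x", "?z"}))
# (?x) AND ((?y) OPT (?y ?x)): ?x is in the optional part and outside, but not in P1
bad = ("AND", ("BGP", {"?x"}), ("OPT", ("BGP", {"?y"}), ("BGP", {"?y", "?x"})))
```

Here `good` fails the monotonicity test because it contains OPT, yet passes the well-designedness test, which is the kind of query the abstract counts as weakly monotonic. `bad` is rejected by the well-designedness check.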
Keywords/Search Tags: RDF, Well-designed patterns, Monotonicity, Satisfiability