
Research on Optimizing Top-k Join Queries Based on SPARQL-RANK

Posted on: 2016-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: F J Chen
Full Text: PDF
GTID: 2298330467490869
Subject: Software engineering
Abstract/Summary:
Searching and ranking semantic data with SPARQL has recently become a research hot spot, driven by the need to find and collect information resources on the web quickly and to give users efficient, accurate access to data for sharing and reuse. Traditional SPARQL top-k join queries are processed with a materialize-then-sort scheme, which is often inefficient. As the ranking functions defined in ORDER BY clauses grow more complex, traditional ranking methods can no longer meet the needs of efficient querying. Meanwhile, the data storage and access model has a great influence on top-k join queries, yet there has been little research on optimizing top-k join queries in RDF native stores. This thesis proposes a new rank-join operator algorithm, ERA-RJN, based on the SPARQL-RANK algebra, which takes advantage of the random access available in RDF native storage. The optimization strategy has two main parts: (1) speeding up the exploration of effective join mappings by combining two-way random access with parallel sorted access; (2) removing the duplicate joins caused by two-way exploration with a repetition-eliminating strategy, and computing the threshold accurately with a quick-termination strategy so that extraction of the top-k results finishes as early as possible. This thesis implements the ERA-RJN operator on the ARQ-RANK platform and performs experiments that verify the high efficiency of the ERA-RJN algorithm on SPARQL top-k join queries in RDF native storage.

Moreover, in order to estimate the cost of the ERA-RJN operator on massive semantic data, this thesis designs a tuple scanning depth estimation model. Experiments show that the model is valid within an allowable error range.
Keywords/Search Tags: RDF native storage, SPARQL-RANK, top-k join queries, rank-join operator, random access
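
To make the idea in the abstract concrete, the following is a minimal Python sketch of a generic rank join that combines sorted access with random access, duplicate elimination, and threshold-based early termination. It is not the thesis's ERA-RJN operator, whose details are not given on this page. The function name `rank_join_top_k`, the in-memory hash indexes standing in for the store's random access, the strictly alternating sorted access, and the use of a sum as the ranking function are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def rank_join_top_k(left, right, k):
    """Top-k join of two inputs of (join_key, score) rows sorted by score, descending.

    Returns (results, depth): results is a list of (join_key, combined_score),
    best first; depth is how many rows sorted access read from each input.
    Hypothetical helper for illustration only; the ranking function is a sum.
    """
    if k <= 0:
        return [], (0, 0)
    inputs = (left, right)
    # Hash indexes stand in for the store's random access by join key (assumption).
    index = [defaultdict(list), defaultdict(list)]
    for side, rows in enumerate(inputs):
        for pos, (key, score) in enumerate(rows):
            index[side][key].append((pos, score))

    cursor = [0, 0]  # sorted-access depth per input
    # Upper bound on the score of any row not yet read from each input.
    last = [rows[0][1] if rows else 0 for rows in inputs]
    seen = set()     # (left_pos, right_pos) pairs already joined
    top = []         # min-heap holding the best k results found so far
    side = 0
    # Once either input is fully read, every join result has been produced.
    while cursor[0] < len(left) and cursor[1] < len(right):
        pos = cursor[side]
        key, score = inputs[side][pos]
        cursor[side] += 1
        last[side] = score
        # Random access: probe the other input for all rows with this key.
        for other_pos, other_score in index[1 - side][key]:
            pair = (pos, other_pos) if side == 0 else (other_pos, pos)
            if pair in seen:  # skip results already found from the other side
                continue
            seen.add(pair)
            item = (score + other_score, pair[0], pair[1])
            if len(top) < k:
                heapq.heappush(top, item)
            elif item > top[0]:
                heapq.heapreplace(top, item)
        # A result not yet produced joins two unread rows, so its score
        # cannot exceed last[0] + last[1]; stop once the k-th best reaches that.
        if len(top) == k and top[0][0] >= last[0] + last[1]:
            break
        side = 1 - side
    ranked = sorted(top, reverse=True)
    return [(left[lp][0], total) for total, lp, _ in ranked], tuple(cursor)


left = [("a", 9), ("b", 8), ("c", 3), ("d", 1)]
right = [("b", 10), ("a", 7), ("d", 2), ("c", 1)]
print(rank_join_top_k(left, right, 2))  # ([('b', 18), ('a', 16)], (2, 2))
```

In this toy run the top-2 results are found after reading two rows from each input, whereas a materialize-then-sort plan would join and sort all four rows per input. The returned depth is the kind of quantity a scanning depth estimation model would try to predict.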