Research And Implementation Of The Query Processing Algorithms For Web-scale RDF Data

Posted on:2015-10-09

Degree:Master

Type:Thesis

Country:China

Candidate:X D Ye

Full Text:PDF

GTID:2308330482954491

Subject:Computer application technology

Abstract/Summary:

Today, the lack of semantic information attached to network resources is one of the main limitations on the development of the Internet. Because the Internet is organized around hyperlinks, it knows only how to display resources, not how to interpret their meaning. RDF (Resource Description Framework), proposed by the W3C, has become the standard description framework of the Semantic Web. With the development of information extraction technology and the Semantic Web, a large amount of RDF data has appeared on the web. The storage, management, and retrieval of large RDF data sets has therefore become a difficult problem that urgently needs to be addressed. SPARQL, also proposed by the W3C, is the standard query language for RDF data.

Existing algorithms for RDF queries face the following challenges. (1) They cannot answer SPARQL queries with wildcards in a scalable manner. (2) They cannot handle frequent updates to RDF repositories efficiently. (3) They cannot support large data sets. To address these three problems, we propose algorithms based on indexes and on distributed environments.

First, in Chapter 3, we propose an index-based algorithm. (1) We use a graph model, i.e., adjacency lists, to store RDF data. (2) Based on the RDF structure, we add a label to each entity vertex and class vertex. We then develop a novel index, the VS*-tree, to search the label information efficiently. The index has a low maintenance cost and is easy to update. (3) Using the label information of the RDF data, we propose a pruning algorithm that can be embedded into text query algorithms. The pruning algorithm applies not only to general SPARQL queries but also to SPARQL queries with a wildcard.

Second, according to the characteristics of RDF data, we propose: (1) leveraging state-of-the-art single-node RDF-store technology; (2) partitioning the data across nodes in a manner that helps accelerate query processing through locality optimizations; (3) decomposing SPARQL queries into high-performance fragments that take advantage of how the data is partitioned in a cluster.

Finally, extensive experiments confirm the efficiency and effectiveness of our solution...
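
The abstract does not spell out the data structures, so the following is only a minimal Python sketch of the general ideas it names: adjacency-list storage, a per-vertex label used to prune candidates for queries with wildcards, and partitioning triples across nodes. Everything in it is an assumption made for illustration: the class `RDFGraph`, the functions `match_star` and `partition_by_subject`, the choice of "set of outgoing predicates" as the label, and subject hashing as the partitioning scheme. It is not the VS*-tree or the thesis's actual algorithm.

```python
import zlib
from collections import defaultdict


class RDFGraph:
    """Adjacency-list RDF store with a per-vertex label (hypothetical design)."""

    def __init__(self):
        # subject -> list of (predicate, object) outgoing edges
        self.adj = defaultdict(list)
        # subject -> set of predicates on its outgoing edges; this "label"
        # is an assumed stand-in for the label information in the abstract
        self.labels = defaultdict(set)

    def add(self, s, p, o):
        self.adj[s].append((p, o))
        self.labels[s].add(p)

    def match_star(self, patterns):
        """Return subjects matching every (predicate, object) pattern.

        None in either position acts as a wildcard.
        """
        required = {p for p, _ in patterns if p is not None}
        results = []
        for s, edges in self.adj.items():
            # Pruning step: skip vertices whose label lacks a required predicate.
            if not required <= self.labels[s]:
                continue
            if all(
                any((p is None or p == ep) and (o is None or o == eo)
                    for ep, eo in edges)
                for p, o in patterns
            ):
                results.append(s)
        return sorted(results)


def partition_by_subject(triples, num_nodes):
    """Assign each triple to a node using a stable hash of its subject.

    All triples sharing a subject land on the same node, so star-shaped
    query fragments can be answered locally (an assumed locality scheme).
    """
    parts = [[] for _ in range(num_nodes)]
    for s, p, o in triples:
        node = zlib.crc32(s.encode("utf-8")) % num_nodes
        parts[node].append((s, p, o))
    return parts


# Hypothetical example data
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:paper1", "rdf:type", "ex:Paper"),
]
graph = RDFGraph()
for t in triples:
    graph.add(*t)

# Object wildcard: persons who know anyone
print(graph.match_star([("rdf:type", "ex:Person"), ("ex:knows", None)]))
# Predicate wildcard: anything linked to ex:Person by any predicate
print(graph.match_star([(None, "ex:Person")]))
```

In this sketch the label check rejects a vertex before its edge list is scanned, which is the role the abstract assigns to pruning. A wildcard in the predicate position contributes nothing to the required label set, so such patterns fall back to scanning the edges.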

Keywords/Search Tags:

Semantic Web, RDF Data, SPARQL, Distributed

Related items

1	Semantic EMR Data SPARQL Query Optimization Mechanisms
2	The Design And Implementation Of SPARQL Based Semantic Web Data Retrieval System
3	Research And Implementation Of The Query Processing Algorithms For Web-scale RDF Data
4	SPARQL BGP Query Engine Based On BSP
5	Research On Distributed RDF Query Processing
6	Fuzzy SPARQL Query Over XML
7	Semantic integration of coastal buoys data using SPARQL
8	Research On Linked Stream Data Query Method Based On SPARQL
9	Research For Modeling Semantic Web Service Base On Semantic Templates
10	An Analytical System For Large Scale Semantic Data