Design And Implementation Of SPARQL Query Engine Based On Heuristic

Posted on:2023-07-10

Degree:Master

Type:Thesis

Country:China

Candidate:X P Dong

Full Text:PDF

GTID:2568306815991199

Subject:Computer technology

Abstract/Summary:

Since its inception, the Semantic Web has been applied in a variety of fields, including the life sciences, statistics, finance, open science, and health. The Semantic Web uses the Resource Description Framework (RDF) as its data format for describing information on the Web. The RDF data model is built from triples, and SPARQL is the Semantic Web's standard language for querying RDF datasets. However, as RDF datasets grow and queries over triple patterns become more complex, SPARQL query efficiency gradually decreases, so performing efficient queries on massive RDF datasets becomes a challenging problem.

To address the long execution times of current SPARQL queries, this thesis proposes a SPARQL query engine based on a heuristic algorithm. The engine uses the Spark-MMAS-LKH optimization algorithm to reorder the triple patterns, computing a cost matrix for the triple patterns to find the best join order. The engine is divided into five parts. The Spark-MMAS-LKH algorithm sits at the core of the engine's query optimization layer and is responsible for reordering the triple patterns. In that layer, optimization proceeds in two steps. First, the initial weight matrix (i.e., the cost matrix) of the triple patterns is constructed, with the initial weights obtained from the estimated cardinality and the estimated join value of the triple patterns. Second, the weight matrix is passed as a parameter to the Spark-MMAS-LKH hybrid optimization algorithm, in which the MMAS (max-min ant colony) algorithm and the LKH algorithm are hybridized in relay mode, and the RDD operators of the distributed framework Spark are used to speed up the iterations of the MMAS-LKH algorithm. This completes the optimization over the weight matrix and finds the optimal join order of the triple patterns.

To verify the effectiveness of the engine designed in this thesis, the SPARQL query engine based on the Spark-MMAS-LKH algorithm is compared with other optimized engines and with the unoptimized original engine on the public dataset LUBM100. The comparison results show that the proposed engine has a positive effect on querying large-scale RDF datasets and achieves the expected results.
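The two optimization steps described above can be illustrated with a minimal single-machine sketch. It is not the thesis implementation. The triple patterns, the cardinality numbers, and the cost formula (a shared variable bounds the join by the smaller input, otherwise a Cartesian product) are all assumptions made for illustration, and every function name is hypothetical. A plain 2-opt local search stands in for LKH, and the Spark RDD parallelization is left out. The relay structure is kept: the order found by the max-min ant system is handed to the local search for refinement.

```python
import random

# Hypothetical triple patterns (subject, predicate, object) with assumed
# cardinality estimates; none of these numbers come from the thesis.
PATTERNS = [
    ("?x", "rdf:type", "ub:GraduateStudent"),
    ("?x", "ub:takesCourse", "?c"),
    ("?c", "rdf:type", "ub:Course"),
    ("?x", "ub:memberOf", "?d"),
]
CARD = [2000, 15000, 1000, 8000]


def variables(pattern):
    """Return the set of variables (terms starting with '?') in a pattern."""
    return {term for term in pattern if term.startswith("?")}


def build_cost_matrix(patterns, card):
    """cost[i][j]: assumed size of joining pattern i with pattern j."""
    n = len(patterns)
    cost = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = variables(patterns[i]) & variables(patterns[j])
            # Assumption: a shared variable bounds the join by the smaller
            # input; no shared variable means a Cartesian product.
            cost[i][j] = min(card[i], card[j]) if shared else card[i] * card[j]
    return cost


def path_cost(order, cost):
    """Sum of the pairwise costs along consecutive patterns in the order."""
    return sum(cost[a][b] for a, b in zip(order, order[1:]))


def mmas(cost, ants=10, iterations=50, rho=0.1, seed=0):
    """Max-min ant system: best-only deposit, pheromone clamped to bounds."""
    rng = random.Random(seed)
    n = len(cost)
    tau = [[1.0] * n for _ in range(n)]
    best = list(range(n))
    for _ in range(iterations):
        for _ in range(ants):
            order = [rng.randrange(n)]
            while len(order) < n:
                cur = order[-1]
                rest = [j for j in range(n) if j not in order]
                weights = [tau[cur][j] / cost[cur][j] for j in rest]
                order.append(rng.choices(rest, weights=weights)[0])
            if path_cost(order, cost) < path_cost(best, cost):
                best = order
        best_cost = path_cost(best, cost)
        tau_max = 1.0 / (rho * best_cost)
        tau_min = tau_max / (2 * n)
        # Evaporate everywhere, then let only the best-so-far order deposit.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= 1 - rho
        for a, b in zip(best, best[1:]):
            tau[a][b] += 1.0 / best_cost
        # Clamp every trail into [tau_min, tau_max].
        for i in range(n):
            for j in range(n):
                tau[i][j] = min(tau_max, max(tau_min, tau[i][j]))
    return best


def two_opt(order, cost):
    """2-opt segment reversal; a simple stand-in for LKH, not LKH itself."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            for j in range(i + 1, len(order)):
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if path_cost(candidate, cost) < path_cost(order, cost):
                    order, improved = candidate, True
    return order


# Relay hybridization: the MMAS result seeds the local search.
cost = build_cost_matrix(PATTERNS, CARD)
seed_order = mmas(cost)
final_order = two_opt(seed_order, cost)
```

In this sketch `final_order` is a list of indices into `PATTERNS`, and the refined order never costs more than the MMAS order it started from. In a distributed setting, the per-ant construction loop is the natural candidate for parallel execution, which is where the abstract places the Spark RDD operators.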

Keywords/Search Tags:

Resource Description Framework, SPARQL Query, Max Min Ant Colony Algorithm, LKH Algorithm, Spark Framework

Related items

1	Efficient SPARQL Theta Join Processing On Large Scale RDF Graphs
2	The Research Of Distributed RDF Data Processing Architecture
3	Temporal RDF Query Language And Its Transformation To TSQL2
4	The Research On Structured Query Generation Framework Based On Semantic Query Graph
5	Framework Description Of Population Migration Algorithm And Its Application
6	Cooperative Query Processing On Heterogeneous Processors
7	Telecom Resource WEB Query Framework Design And Implementation
8	Design And Implementation Of Distributed Intrusion Prevention System Based On Agent And Ant Colony Algorithm
9	A GA-Based SPARQL Static Query Optimization Method
10	An Improved Ant Colony Algorithm Based On Spark In TSP