Query Processing Techniques For Large-scale Product Knowledge Graphs

Posted on:2024-08-21

Degree:Master

Type:Thesis

Country:China

Candidate:C X Fang

GTID:2568307157482274

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of the Internet and the growing diversity of everyday consumer needs, online shopping now generates data at a scale that is difficult to quantify. Compared with general knowledge data, product knowledge data is heterogeneous, massive in scale, and unevenly distributed. As product knowledge graphs continue to grow, users also expect faster responses to knowledge queries. However, existing RDF (Resource Description Framework) knowledge query systems often do not fully account for the structural characteristics of product knowledge graphs, so they fail to optimize product knowledge retrieval effectively. In addition, large-scale product knowledge query services must return real-time and accurate results while product knowledge is constantly supplemented and updated to serve different types of query requests. High-performance query processing therefore requires good scalability, so that queries remain real-time and accurate even after dynamic data updates. This study focuses on the characteristics of product knowledge data and investigates query processing for large-scale product knowledge. The main work includes:

(1) Optimizing index storage by proposing an RDF knowledge storage and query processing method based on predicate indexing. Based on the structural characteristics of product knowledge data, a predicate-index data model is designed that converts RDF triples into entity pairs indexed by predicate, which compresses the stored knowledge data and speeds up index construction and loading. A query optimization algorithm based on query type selection is designed to keep overall query performance efficient. The experimental results show that this solution remains competitive with mainstream RDF query systems in retrieval performance while occupying less disk space and requiring less time to build the index.

(2) Optimizing query efficiency by proposing an RDF knowledge storage and query processing method based on compressed coding tree indexing. The bottleneck of the binary join strategy in graph queries lies in redundant intermediate results, which degrade overall query performance. Therefore, based on the idea of the worst-case optimal join algorithm, the query execution strategy is redesigned to reduce data redundancy during query processing. Furthermore, to improve index scalability, a compressed-coding index structure is designed that stores knowledge triples compactly using numeric encoding and relies on the ordered structure of B+ trees. The experimental results show that this solution performs well in index construction speed and disk space usage and has advantages in knowledge retrieval performance.
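The two ideas in the abstract, grouping triples as entity pairs under their predicate and replacing terms with numeric codes, can be illustrated with a small in-memory sketch. This is not the thesis's implementation: the class name `PredicateIndex`, its methods, and the sample product data are all hypothetical, Python dictionaries and sets stand in for the on-disk structures (including the B+ tree), and the set intersection only hints at how evaluating one variable at a time avoids the redundant intermediate results of pairwise joins.

```python
from collections import defaultdict


class PredicateIndex:
    """Toy in-memory predicate index over dictionary-encoded RDF triples.

    Hypothetical illustration only; not the data model from the thesis.
    """

    def __init__(self):
        self.term_to_id = {}   # dictionary encoding: term -> integer id
        self.id_to_term = []   # reverse mapping: integer id -> term
        # predicate id -> subject id -> set of object ids
        self.by_subject = defaultdict(lambda: defaultdict(set))
        # predicate id -> object id -> set of subject ids
        self.by_object = defaultdict(lambda: defaultdict(set))

    def _encode(self, term):
        """Return the integer id for a term, assigning a new one if needed."""
        if term not in self.term_to_id:
            self.term_to_id[term] = len(self.id_to_term)
            self.id_to_term.append(term)
        return self.term_to_id[term]

    def add(self, s, p, o):
        """Store triple (s, p, o) as an entity pair under its predicate."""
        s_id, p_id, o_id = self._encode(s), self._encode(p), self._encode(o)
        self.by_subject[p_id][s_id].add(o_id)
        self.by_object[p_id][o_id].add(s_id)

    def subject_ids(self, p, o):
        """Ids of all subjects x such that (x, p, o) is stored."""
        p_id, o_id = self.term_to_id.get(p), self.term_to_id.get(o)
        if p_id is None or o_id is None:
            return set()
        return self.by_object.get(p_id, {}).get(o_id, set())

    def star_query(self, constraints):
        """Subjects that satisfy every (predicate, object) constraint.

        Candidate sets are intersected smallest-first, so no pairwise
        intermediate join result is materialized.
        """
        candidates = sorted(
            (self.subject_ids(p, o) for p, o in constraints), key=len
        )
        if not candidates:
            return set()
        result = set(candidates[0])
        for ids in candidates[1:]:
            result &= ids
        return {self.id_to_term[i] for i in result}


# Made-up product triples, for illustration only.
idx = PredicateIndex()
idx.add("phone1", "brand", "Acme")
idx.add("phone1", "category", "Phone")
idx.add("phone2", "brand", "Acme")
idx.add("phone2", "category", "Tablet")
print(idx.star_query([("brand", "Acme"), ("category", "Phone")]))  # {'phone1'}
```

The query corresponds roughly to a SPARQL pattern with two triple patterns sharing one subject variable. Keeping both a subject-keyed and an object-keyed map per predicate doubles memory in this sketch; a real system would weigh that against the compression goals the abstract describes.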

Keywords/Search Tags:

RDF data, SPARQL query, predicate index, compressed encoded tree, query processing

Related items

1	Research On Predicate-based Query Processing Of XML Streams
2	Research On A Hashing Index Based RDF Data Storage And Query System And Its Application
3	Research On Distributed Query Processing And Optimization Of RDF Data
4	Research On Distributed RDF Query Processing
5	Research On SPARQL Query Engine Across Different Storage Platform
6	SPARQL Federated Query And Its Application On The Semantic Web
7	Research On Multidimensional Top-k Query Processing In IoT Sensing Networks
8	The Research On Structured Query Generation Framework Based On Semantic Query Graph
9	Based On Algebra Tree's ORDBMS Query Optimization Technology Research And Algorithm Design
10	Research On Techniques And Systems For Index And Query Optimization Of Big Data