Research On Key Technologies Of Query Supported Big RDF Data Compression

Posted on: 2021-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: W X Wu
GTID: 2428330614963565
Subject: Computer application technology
Abstract/Summary:
As Semantic Web technologies flourish, the volume of RDF documents is growing at an unprecedented speed, which raises serious challenges for storage and exchange. Although several general-purpose and RDF-specific compression approaches exist, the redundancy of predicates connected with objects has still not been solved. In addition, querying compressed data still suffers from a trade-off between compression ratio and query efficiency, and improving both at the same time needs further research.

To reduce the redundancy of predicates connected with objects, we propose Delta Encoding for RDF Grouping Compression (DGC), which groups RDF data by the predicate groups connected with each object. This removes more predicate redundancy than existing representations. We then apply delta encoding to the grouped subjects to optimize the storage of the subject sequences. This thesis also implements data query for the DGC compression algorithm, so that compressed data can be queried directly on the original DGC structure. We then introduce an inverted index and a wavelet tree into the query algorithm, which avoids full data retrieval and narrows the search range, improving query performance.

The main contributions of this thesis are:
(1) Propose a less redundant RDF data representation that further reduces predicate redundancy, and introduce differential (delta) coding to optimize the storage space of the subject sequence.
(2) Implement a data query algorithm based on the DGC grouped data representation, which satisfies the query and data management needs of lightweight clients.
(3) Propose a query acceleration strategy for the DGC query algorithm by introducing an inverted index and a wavelet tree. Query patterns with known predicates or known subjects are accelerated at an acceptable additional cost.

Experiments show that the DGC compression algorithm achieves a clear improvement over existing methods on different datasets, so the proposed method performs better in RDF compression scenarios. The DGC query algorithm also performs better than the current query algorithm across different query patterns.
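The grouping, delta-encoding, and index ideas described above can be illustrated with a small Python sketch. This is not the DGC format from the thesis, whose exact layout the abstract does not give. It assumes that a "group" is one (predicate, object) pair, that all terms share a single integer dictionary, and that the inverted index maps a predicate ID to its groups. All function names are hypothetical, and the wavelet tree is omitted.

```python
from collections import defaultdict


def build_dictionary(triples):
    """Assign an integer ID to every distinct term (assumed: one shared ID space)."""
    terms = sorted({term for triple in triples for term in triple})
    return {term: i for i, term in enumerate(terms)}


def delta_encode(sorted_ids):
    """Store each value as the gap from its predecessor (the first gap is from 0)."""
    deltas, prev = [], 0
    for value in sorted_ids:
        deltas.append(value - prev)
        prev = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the gaps."""
    values, total = [], 0
    for gap in deltas:
        total += gap
        values.append(total)
    return values


def group_and_compress(triples):
    """Group subjects under (predicate, object) so each pair is stored once,
    then delta-encode the sorted subject IDs of every group."""
    ids = build_dictionary(triples)
    groups = defaultdict(set)
    for s, p, o in triples:
        groups[(ids[p], ids[o])].add(ids[s])
    compressed = {key: delta_encode(sorted(subs)) for key, subs in groups.items()}
    return ids, compressed


def build_predicate_index(compressed):
    """Inverted index (assumed shape): predicate ID -> list of group keys."""
    index = defaultdict(list)
    for key in compressed:
        index[key[0]].append(key)
    return index


def query_by_predicate(predicate, ids, compressed, index):
    """Answer the pattern (?s, predicate, ?o) by visiting only matching groups."""
    names = {i: term for term, i in ids.items()}
    results = []
    for key in index.get(ids[predicate], []):
        for s in delta_decode(compressed[key]):
            results.append((names[s], predicate, names[key[1]]))
    return sorted(results)


if __name__ == "__main__":
    data = [
        ("alice", "knows", "carol"),
        ("bob", "knows", "carol"),
        ("dave", "knows", "carol"),
        ("alice", "type", "Person"),
    ]
    ids, compressed = group_and_compress(data)
    index = build_predicate_index(compressed)
    print(query_by_predicate("knows", ids, compressed, index))
```

In this sketch the three "knows carol" triples store the predicate and object once, and the subject list holds only small gaps. A query with a known predicate touches only the groups listed in the index instead of scanning all data.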
Keywords/Search Tags: Semantic Web, RDF, data compression, SPARQL query