Research On Massive RDF Data Storage And Query Technology

Posted on: 2018-07-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Yang
Full Text: PDF
GTID: 2428330596969797
Subject: Computer technology
Abstract/Summary:
With the explosive growth of Internet data, it has become much harder for people to find the information they want in a short time. The Semantic Web enables machines to understand the meaning of data and helps people access information more quickly. RDF (Resource Description Framework) is the standard data model for exchanging data on the Semantic Web; it describes resources and the relationships between them as triples. As the Semantic Web has developed, the volume of RDF data has increased dramatically. Stand-alone RDF storage and query systems can no longer meet practical needs, and large-scale RDF data management faces great challenges. Scalable RDF storage and query systems have therefore become a research hotspot in the Semantic Web field. Existing RDF query systems are built either on Hadoop or on general-purpose distributed technology. The former incur heavy disk I/O and the latter scale poorly. In addition, both kinds of system perform badly on basic graph pattern queries. This thesis designs an RDF query system based on Spark and Redis. We use the size of the intermediate results to improve the query plan generation algorithm, and we implement the Map Join algorithm to make the join operation faster.

Our main contributions are as follows:
(1) We speed up the system's iterative operations, ID mapping and index loading by using Spark and Redis, both of which are based on in-memory computation.
(2) We improve the query plan generation algorithm by using the selectivity of the intermediate results.
(3) We make the join operation faster by using the Map Join algorithm, which effectively reduces network I/O.

After completing the prototype system, we tested it with the LUBM benchmark. We demonstrate its superior performance in comparison with the Hadoop-based approaches and analyze its additional costs. Our experiments show that the system achieves higher query performance than other existing distributed RDF query systems.
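
The Map Join idea named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it is plain Python with no Spark or Redis calls, and the function names (`build_hash_table`, `map_join`) and the dictionary representation of variable bindings are assumptions made for illustration. The sketch assumes one side of the join is small enough to be copied to every worker as a hash table, so each partition of the large side is joined locally and no rows of the large side are moved across the network.

```python
from collections import defaultdict


def build_hash_table(small_rows, key):
    """Index the small relation by the join variable (hypothetical helper)."""
    table = defaultdict(list)
    for row in small_rows:
        table[row[key]].append(row)
    return table


def map_join_partition(partition, table, key):
    """Join one partition of the large relation against the shared hash table."""
    joined = []
    for row in partition:
        # Probe the local copy of the small side; no shuffle is needed.
        for match in table.get(row[key], []):
            merged = dict(match)
            merged.update(row)
            joined.append(merged)
    return joined


def map_join(large_partitions, small_rows, key):
    """Simulate a map-side join: build the table once, reuse it per partition."""
    table = build_hash_table(small_rows, key)  # stands in for a broadcast
    return [map_join_partition(p, table, key) for p in large_partitions]


# Bindings for two triple patterns that share the variable "x" (made-up data).
large = [
    [{"x": "s1", "y": "p1"}],
    [{"x": "s2", "y": "p2"}, {"x": "s1", "y": "p3"}],
]
small = [{"x": "s1", "n": "Ann"}]
result = map_join(large, small, "x")
```

In a cluster setting the hash table would be shipped to each worker once, which is why the technique only pays off when one input, such as a highly selective intermediate result, is small.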
Keywords/Search Tags: Semantic Web, Big RDF Graph, Spark, Redis