
Research And Implementation Of Multi-Keyword Parallel Search In Streaming RDF Data

Posted on: 2019-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: L Yu
Full Text: PDF
GTID: 2428330572495595
Subject: Computer technology
Abstract/Summary:
The Resource Description Framework (RDF) is a framework proposed by the World Wide Web Consortium (W3C) for describing Semantic Web resources. With the rollout of the Linked Open Data and DBpedia projects, open RDF data continues to grow. Acceptable processing and response times keep getting shorter while the data changes faster and faster, so the speed of stream data processing is very important. Real-time analysis and stream processing of big data is therefore a significant research topic. The main contents of this thesis are as follows.

First, this thesis proposes a multi-keyword parallel search algorithm (MKPSA) combined with ontology query subgraphs. The algorithm combines the ontology information of RDF with the distributed Redis database to design a storage scheme for massive data. It constructs the ontology query subgraphs corresponding to the keyword set by pruning and merging the associated class graph, ranks these subgraphs to determine query priority, and then combines the Hadoop computing framework with the storage scheme to run a distributed parallel search that returns the top-k query results.

Second, MKPSA has two disadvantages: MapReduce has to start multiple jobs, and the repeated startup wastes cluster performance; and the data volume is too large to store real-time data. To address these, we propose MPSASR (Multi-keyword Parallel Search Algorithm for Streaming RDF Data), a parallel keyword search algorithm. Following the characteristics of streaming data, the algorithm designs a distributed storage scheme and encodes the prefixes of RDF data with a hash coding compression strategy to reduce the memory occupied by data storage. The Spark framework is then used to build a distributed design on top of MKPSA: real-time top-k query results are obtained through MKPSA-based real-time stream queries and query iteration over Map operations.

Finally, the MPSASR algorithm proposed in this thesis is applied in the "Camp Network Information Service Platform" project. Real-time data accumulated through the project's online learning, online education, and online evaluation functions has formed a large data set of officers' and soldiers' learning records. The knowledge base of the whole project is built with an ontology construction method and, combined with MPSASR, multi-keyword search is implemented for the streaming data in the project.
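
The abstract states that MPSASR encodes RDF prefixes with a hash coding compression strategy but does not give the scheme. The following is only a minimal sketch of one way such an encoding could work, not the thesis's implementation. The class name `PrefixEncoder`, the helper `split_iri`, the choice of SHA-256, the 8-character code length, and the example DBpedia IRIs are all assumptions made for illustration.

```python
import hashlib


def split_iri(iri):
    """Split an IRI into (prefix, local name) at the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/")) + 1
    return iri[:cut], iri[cut:]


class PrefixEncoder:
    """Hypothetical encoder: replaces long IRI prefixes with short hash codes."""

    def __init__(self, code_len=8):
        self.code_len = code_len      # assumed code length, in hex characters
        self.prefix_by_code = {}      # code -> full prefix, needed for decoding

    def _code(self, prefix):
        # Truncated SHA-256 digest; the hash function is an assumption.
        digest = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        return digest[: self.code_len]

    def encode(self, iri):
        prefix, local = split_iri(iri)
        if not prefix:
            return iri                # nothing to compress
        code = self._code(prefix)
        known = self.prefix_by_code.setdefault(code, prefix)
        if known != prefix:
            # Two different prefixes mapped to the same truncated hash.
            raise ValueError(f"hash collision on code {code}")
        return f"{code}:{local}"

    def decode(self, token):
        code, _, local = token.partition(":")
        prefix = self.prefix_by_code.get(code)
        # Tokens whose code is unknown were never compressed; return as-is.
        return token if prefix is None else prefix + local


encoder = PrefixEncoder()
triple = (
    "http://dbpedia.org/resource/Beijing",
    "http://dbpedia.org/ontology/country",
    "http://dbpedia.org/resource/China",
)
encoded = tuple(encoder.encode(term) for term in triple)
decoded = tuple(encoder.decode(term) for term in encoded)
```

In this sketch the subject and object share one prefix, so they share one code, and each stored term shrinks to the code plus its local name. The code-to-prefix table has to be kept alongside the data (for example in the same key-value store) so that results can be decoded.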
Keywords/Search Tags: RDF, Multi-Keyword, Redis, Stream, Spark-Streaming