Research And Implementation Of Distributed RDF Keyword Approximate Search

Posted on: 2018-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2428330542476899
Subject: Computer technology

Abstract/Summary:
With the rapid development of the Semantic Web, data in RDF (Resource Description Framework) format is widely used in fields such as encyclopedias, geographic information, and the life sciences. Under the pressure of massive data, traditional RDF keyword approximate search methods cannot meet demand, so an efficient distributed keyword approximate search method for large-scale RDF data is needed. In this thesis, we make full use of the semantic information in the RDF ontology and propose a distributed RDF keyword approximate search algorithm built on the Hadoop platform and the Redis database; it achieves higher search efficiency and better search results than traditional methods. We then propose a real-time keyword approximate search algorithm built on the Storm platform and DRPC (Distributed Remote Procedure Call). This algorithm addresses Hadoop's inability to process stream data and further improves search efficiency. The algorithms are applied in a practical project: remote monitoring and fault diagnosis for a military unit. The main contents of this thesis are as follows.

Firstly, we propose DKASR (Distributed Keyword Approximate Search algorithm for RDF), a distributed RDF keyword approximate search algorithm. DKASR uses RDF ontology information to construct ontology sub-graphs for the keyword set and defines a semantic scoring function to rank them. The MapReduce computation model is then used to search in parallel across the cluster. When fewer than k results are found, the ontology sub-graphs are extended to generate approximate ontology sub-graphs, which are ranked with a semantic similarity function. The MapReduce parallel search is repeated until Top-k results are obtained.

Secondly, to address the large storage footprint of the data and the inability of DKASR to search stream data in real time, we propose RKASS (Real-time Keyword Approximate Search algorithm based on Storm), which includes a distributed storage scheme for stream data. To reduce storage space, we propose a hash coding compression strategy that encodes and compresses the prefixes of RDF data and builds the corresponding hash mapping information. Storm is used to ingest data in real time. When keywords are mapped and matched, historical data is reused to skip unnecessary steps and speed up the search. Storm is also used to construct and rank the ontology sub-graphs, construct and rank the approximate ontology sub-graphs, construct the result sub-graphs, and decode the encoded data. Real-time keyword approximate search over stream data is then performed by calling the DRPC server.

Finally, the proposed algorithms are applied to a remote monitoring and fault diagnosis system for micro laser equipment. We analyze the characteristics of the micro laser equipment fault case data in the project, build the project's knowledge base using an ontology construction method, and implement efficient keyword approximate search in the project.
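The hash coding compression of RDF prefixes described for RKASS can be illustrated with a minimal Python sketch. The abstract does not give the actual encoding, so everything here is an assumption made for illustration: the names `prefix_code`, `split_iri`, and `PrefixCodec` are hypothetical, an 8-character truncated MD5 digest stands in for the hash code, and an in-memory dictionary stands in for whatever store holds the hash mapping information.

```python
import hashlib


def prefix_code(prefix: str, length: int = 8) -> str:
    """Return a short deterministic code for a prefix (assumed scheme: truncated MD5)."""
    return hashlib.md5(prefix.encode("utf-8")).hexdigest()[:length]


def split_iri(iri: str) -> tuple:
    """Split an IRI into (prefix, local name) at the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[:cut + 1], iri[cut + 1:]


class PrefixCodec:
    """Hypothetical codec: replaces long IRI prefixes with short hash codes."""

    def __init__(self):
        # Hash mapping information: code -> original prefix, used for decoding.
        self.code_to_prefix = {}

    def encode(self, iri: str) -> str:
        prefix, local = split_iri(iri)
        if not prefix:
            return iri  # nothing to compress
        code = prefix_code(prefix)
        known = self.code_to_prefix.setdefault(code, prefix)
        if known != prefix:
            # Truncated hashes can collide; a real system would need a fallback.
            raise ValueError("hash collision between prefixes")
        return code + ":" + local

    def decode(self, term: str) -> str:
        code, sep, local = term.partition(":")
        if sep and code in self.code_to_prefix:
            return self.code_to_prefix[code] + local
        return term  # not an encoded term; return unchanged


# Usage with a made-up triple (example.org IRIs are placeholders).
codec = PrefixCodec()
triple = (
    "http://example.org/equipment#Laser01",
    "http://example.org/ontology#hasFault",
    "http://example.org/fault#Overheat",
)
encoded = tuple(codec.encode(t) for t in triple)
decoded = tuple(codec.decode(t) for t in encoded)
```

In this sketch each distinct prefix is stored once in the mapping, so triples that share a namespace carry only the short code, and decoding restores the original IRIs when result sub-graphs are returned.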
Keywords/Search Tags: RDF, Keyword Search, Approximate Search, Redis, Storm