Research On Keyword Search On Graphs Based On MapReduce

Posted on: 2018-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: H C Yu
Full Text: PDF
GTID: 2348330536979665
Subject: Computer application technology
Abstract/Summary:
With the explosive growth of data on the network and the generation of large amounts of graph data, keyword search on graphs has attracted considerable attention from researchers. Keyword search on a graph differs from other keyword search algorithms in that its query result is a subgraph of the original graph containing all the search keywords. The basic idea is to find nodes that contain one or more keywords, and then to find the root nodes that can reach these nodes. However, existing algorithms have the following deficiencies. First, during the query, they do not consider the degree of match between the search keywords and the subgraph. Second, they consider only the weight of the edges while neglecting the weight of the nodes. Third, as the graph data grows in size, their efficiency keeps decreasing.

Our research work can be divided into the following steps. In the first step, we transform the data set into graph-structured data. In the second step, we divide the large graph into several r-radius subgraphs and then calculate the weights of the subgraphs. In the third step, we use the StandardAnalyzer in Lucene to segment the information contained in the subgraphs and extract the keywords, then calculate the relevance between each keyword and subgraph according to the TF-IDF algorithm, and finally use MapReduce, a distributed computing framework, to construct the inverted index file.

To solve the problem of the low efficiency of centralized search, this paper proposes a MapReduce-based algorithm for keyword search on graphs, and we also design a prototype system for keyword search on graphs. Theoretical analysis and experiments show that the proposed method effectively addresses the inefficiency of keyword search on graphs for large-scale data.

The main contributions of this paper are as follows. First, when ranking the results, we consider not only the edge weight but also the node weight. Second, we consider the correlation between the keyword and the subgraph. Third, we propose a MapReduce-based algorithm for keyword search on graphs, which aims to solve the problem that centralized search is inefficient for large-scale data.
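The indexing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names (`r_radius_subgraph`, `map_phase`, `reduce_phase`, `build_inverted_index`) are hypothetical. A simple regular-expression tokenizer stands in for Lucene's StandardAnalyzer, and an in-memory dictionary stands in for the MapReduce shuffle. The TF-IDF variant used here (term frequency normalized by text length, IDF as log(N/df)) is an assumption, and the node and edge weighting of subgraphs is left out because the abstract does not define it.

```python
import math
import re
from collections import defaultdict, deque


def r_radius_subgraph(adj, center, r):
    """Return the set of nodes within r hops of `center` (breadth-first search)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == r:
            continue  # do not expand beyond the radius
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


def tokenize(text):
    """Assumed stand-in for an analyzer: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def map_phase(subgraph_id, text):
    """Mapper: emit (term, (subgraph_id, normalized term frequency))."""
    tokens = tokenize(text)
    counts = defaultdict(int)
    for tok in tokens:
        counts[tok] += 1
    for term, count in counts.items():
        yield term, (subgraph_id, count / len(tokens))


def reduce_phase(term, postings, num_subgraphs):
    """Reducer: turn term frequencies into TF-IDF scores, highest first."""
    idf = math.log(num_subgraphs / len(postings))
    scored = [(sid, tf * idf) for sid, tf in postings]
    return term, sorted(scored, key=lambda p: -p[1])


def build_inverted_index(subgraph_texts):
    """Simulate map -> shuffle -> reduce in memory."""
    shuffled = defaultdict(list)
    for sid, text in subgraph_texts.items():
        for term, value in map_phase(sid, text):
            shuffled[term].append(value)
    n = len(subgraph_texts)
    return dict(reduce_phase(t, p, n) for t, p in shuffled.items())


if __name__ == "__main__":
    # Toy graph: a path a - b - c - d, each node carrying some text.
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    node_text = {"a": "graph keyword search", "b": "mapreduce index",
                 "c": "graph index", "d": "distributed search"}
    # One 1-radius subgraph per node; its text is the text of its members.
    texts = {
        center: " ".join(node_text[n] for n in sorted(r_radius_subgraph(adj, center, 1)))
        for center in adj
    }
    for term, postings in sorted(build_inverted_index(texts).items()):
        print(term, postings)
```

In a real deployment the mapper and reducer would run as distributed tasks, and the shuffle would be handled by the framework. With this IDF variant, a term that occurs in every subgraph gets a score of zero, so it does not help to distinguish one subgraph from another.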
Keywords/Search Tags:earch, MapReduce, Inverted index, Distributed