Research On Multi Keywords Query Technology Based On Graph

Posted on: 2017-02-21 | Degree: Master | Type: Thesis
Country: China | Candidate: Y C Zhang
GTID: 2308330488497110 | Subject: Software engineering
Abstract/Summary:
Over the past decade, the rapid development of the Internet has led to explosive growth in the information available online. How to find the information of interest within such a huge volume of data has become an urgent problem, and search engines emerged in response. Keyword search is the most commonly used search-engine mechanism.

This paper begins with a discussion of graph data storage and processing: the open-source graph database Neo4j is used for data storage, the adjacency matrix of the graph is stored with a K2-tree, the large data graph is partitioned based on a radius r, and the resulting subgraphs are clustered with K-means. The text information of each subgraph is then processed by word segmentation, stop-word removal, and feature extraction, and an inverted index is built according to the scores given by a ranking function. Simhash is applied to the inverted index table, which is mapped into several index lists. For query results, topic filtering with the LDA (Latent Dirichlet Allocation) topic model makes the results more consistent with the user’s query intention.

The method in this paper has the following advantages. First, a graph similarity calculation method based on the combination of text and structure is proposed, which fully takes into account the similarity between the texts of different nodes, so the similarity calculation is more reasonable and accurate. Second, simhash is applied to the inverted list and the hash values are used as index terms to improve efficiency. Third, the LDA model is used to filter the initial query results, so that the results are more consistent with the user’s query intention.

Experiments show that the system can retrieve the desired information quickly from keywords, and that the results are more consistent with the user’s query intention.
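The abstract does not spell out how simhash is applied to the inverted index, so the following is only a minimal sketch of the general idea of using a simhash fingerprint as an index key. It assumes each subgraph is already reduced to a dictionary of weighted terms; the function names (`simhash`, `hamming`, `build_index`), the 64-bit width, and the use of MD5 as the per-term hash are all illustrative choices, not details taken from the thesis.

```python
import hashlib
from collections import defaultdict


def simhash(weighted_terms, bits=64):
    """Return a `bits`-wide simhash fingerprint for a {term: weight} dict."""
    totals = [0.0] * bits
    for term, weight in weighted_terms.items():
        # Hypothetical choice: take the leading bytes of MD5 as the term hash.
        digest = hashlib.md5(term.encode("utf-8")).digest()
        term_hash = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each term votes +weight on its 1-bits and -weight on its 0-bits.
            if (term_hash >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight
    fingerprint = 0
    for i in range(bits):
        if totals[i] > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def build_index(subgraph_terms, bits=64):
    """Map each fingerprint to the ids of the subgraphs that produced it."""
    index = defaultdict(list)
    for subgraph_id, terms in subgraph_terms.items():
        index[simhash(terms, bits)].append(subgraph_id)
    return index


# Toy data (invented for illustration): term weights per subgraph.
subgraphs = {
    "g1": {"graph": 2.0, "keyword": 1.0, "search": 1.0},
    "g2": {"graph": 2.0, "keyword": 1.0, "search": 1.0},
    "g3": {"topic": 3.0, "model": 1.0},
}
index = build_index(subgraphs)
```

Subgraphs with identical weighted terms land under the same key, and a small Hamming distance between two fingerprints suggests similar term distributions, which is what makes the fingerprint usable as a compact index term.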
Keywords/Search Tags: Graph, Keyword Search, Hash, LDA Model, Indexing