Query Recommendation Algorithm Based On Bipartite Graph

Posted on: 2015-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhu
Full Text: PDF
GTID: 2268330428466214
Subject: Software engineering
Abstract/Summary:
The Internet has become one of the world's largest knowledge bases. It contains vast amounts of information, and the amount of online information available to people keeps growing. Faced with information at this scale, users are often at a loss as to how to find what they need quickly and accurately. Search engines help people retrieve information from massive data and have become one of the most important, even indispensable, means of accessing online information. However, interaction between the user and the search engine still mainly works as follows: the user independently enters query keywords according to an information need, and the search engine returns the query results. Because the entered queries are generally short and often ambiguous, the search engine cannot accurately understand the user's real search intent. Against this background, query recommendation has been widely adopted by search engines. It helps them understand the user's real query intent more accurately and construct more sophisticated queries.

This thesis studies a query recommendation algorithm based on a bipartite graph. Sogou query logs are used as the experimental data set. After analysis and preprocessing of the data set, 310,000 click-through records are extracted as experimental data. The edge-weight formula of the bipartite graph takes into account both the rank of the clicked URL in the search engine results list and the order in which the user clicked it. The edge weights are then computed using the idea of TF-IDF, yielding a weighted Query-URL bipartite graph. Each query is represented by a vector built from the set of URLs that users clicked for it, and the similarity between any two different queries is computed with cosine similarity. Finally, the correlations between queries are described by a graph called the relational query network.

Recommending N candidates for an input query proceeds as follows. First, the neighbor nodes of the input query's node in the relational query network are collected into a candidate set H. If H contains at least N candidates, the N queries with the highest similarity scores to the input query are selected as recommendations. Otherwise, nodes that are indirectly connected to the input query within an h-hop range are added to H, and the queries in H are clustered with the k-means algorithm. The queries in the cluster that contains the input query are then sorted, and the N candidates with the highest similarity scores are recommended. Experimental results show that the proposed query recommendation algorithm produces good recommendations and has practical value.
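
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not state the edge-weight formula, so the click score used here (1/rank + 1/click_order) and the smoothed TF-IDF-style factor are placeholder assumptions. The function names, the similarity threshold, and the sample click records are hypothetical. The k-means step is left out to keep the sketch short, so the expanded candidate set is ranked directly by cosine similarity.

```python
import math
from collections import defaultdict, deque


def build_weighted_graph(clicks):
    """clicks: iterable of (query, url, rank, click_order), both 1-based.
    Returns {query: {url: weight}}, a weighted Query-URL bipartite graph."""
    raw = defaultdict(lambda: defaultdict(float))
    for query, url, rank, click_order in clicks:
        # Placeholder click score (assumption, not the thesis formula):
        # higher-ranked results and earlier clicks contribute more.
        raw[query][url] += 1.0 / rank + 1.0 / click_order
    # Count how many distinct queries led to a click on each URL.
    url_df = defaultdict(int)
    for urls in raw.values():
        for url in urls:
            url_df[url] += 1
    n_queries = len(raw)
    graph = {}
    for query, urls in raw.items():
        total = sum(urls.values())
        # TF-IDF-style weight: the URL's share of the query's click score,
        # times a smoothed inverse query frequency of the URL.
        graph[query] = {
            url: (score / total) * math.log(1.0 + n_queries / url_df[url])
            for url, score in urls.items()
        }
    return graph


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_query_network(graph, threshold=0.0):
    """Link two queries when their cosine similarity exceeds threshold."""
    network = {q: {} for q in graph}
    queries = list(graph)
    for i, a in enumerate(queries):
        for b in queries[i + 1:]:
            sim = cosine(graph[a], graph[b])
            if sim > threshold:
                network[a][b] = sim
                network[b][a] = sim
    return network


def recommend(network, graph, query, n, h=2):
    """Return up to n recommended queries for an input query."""
    candidates = set(network[query])  # candidate set H: direct neighbors
    if len(candidates) < n:
        # Too few neighbors: add nodes within h hops (breadth-first search).
        seen = {query}
        frontier = deque([(query, 0)])
        while frontier:
            node, depth = frontier.popleft()
            if depth == h:
                continue
            for neighbor in network[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, depth + 1))
        candidates = seen - {query}
    # k-means clustering of H is omitted here; rank by similarity instead.
    ranked = sorted(candidates,
                    key=lambda q: (-cosine(graph[query], graph[q]), q))
    return ranked[:n]


# Hypothetical click records: (query, url, rank, click_order).
clicks = [
    ("python tutorial", "u1", 1, 1),
    ("python tutorial", "u2", 2, 2),
    ("learn python", "u1", 1, 1),
    ("learn python", "u3", 3, 2),
    ("python course", "u3", 1, 1),
    ("java tutorial", "u4", 1, 1),
]
graph = build_weighted_graph(clicks)
network = build_query_network(graph)
print(recommend(network, graph, "python tutorial", 2))
```

In this sample, "python tutorial" has only one direct neighbor, so asking for two recommendations triggers the h-hop expansion. A query reached only indirectly shares no clicked URL with the input query and therefore has cosine similarity 0, which is one reason a clustering step over H, as the abstract describes, could be preferable to direct ranking.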
Keywords/Search Tags: weighted bipartite graph, query recommendation, relational query network, k-means clustering