The Query Recommendation Algorithm Research Based On The Search Logs

Posted on: 2014-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: L S Ji
Full Text: PDF
GTID: 2248330395997181
Subject: Software engineering
Abstract/Summary:
In recent years, with the spread of the World Wide Web, more and more information has appeared online, and the network has become a primary carrier of information. Information retrieval technology has developed rapidly and come into widespread use. Search engines, as a basic Internet application, occupy an important position for the majority of Internet users. However, given the vast information resources of the network, enabling users to obtain the information they need quickly and accurately has become a problem.
Because of the gap between natural language and machine language, a retrieval system cannot accurately recognize a user's query intent, so the returned results can be biased and inaccurate. Extensive research has found that different users who submit the same search term often mean different things by it; conversely, in some cases users submit different search terms that mean the same thing. Therefore, to make it easier for users to correct their query words and to find other queries that express a similar need, this study focuses on query recommendation. Currently, most search engine query recommendation methods are based either on documents or on search engine logs; mining a corpus or a log makes query recommendation possible.
Based on a review of the literature, this paper covers the working principles and architecture of search engines and studies the basic theory and techniques of Chinese word segmentation, natural language processing, and information retrieval models, taking search log data as the object of analysis. It focuses on query recommendation as a personalized service function of search engines: when a user submits a query term, the proposed method recommends related query terms.
To improve the timeliness and accuracy of recommendations and to provide users with an effective query recommendation service, the proposed method is divided into two stages. The first is the offline processing stage, which includes data preprocessing, construction of the user query-click bipartite graph, and offline clustering of that bipartite graph. Because these operations are time-consuming, performing them offline keeps online recommendation real-time. The second is the online recommendation stage, which finds the query cluster most relevant to the currently submitted query and then recommends the Top-N queries with the highest similarity. Experimental results show that the proposed method not only maintains the recall rate but also improves precision. Because clustering is performed in advance in the offline stage, the method reduces the user's waiting time and makes query recommendation friendlier and more usable.
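The two-stage pipeline described above can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or similarity measure the thesis uses, so the choices here are assumptions made only for illustration: queries are represented as vectors of click counts over URLs, similarity is cosine similarity, and clustering is a greedy single-pass scheme with a hypothetical `threshold` parameter. The function names and the sample log are likewise hypothetical.

```python
from collections import defaultdict
from math import sqrt


def build_click_vectors(log):
    """Offline: turn (query, clicked_url) records into the query side of the
    query-click bipartite graph, stored as query -> {url: click count}."""
    vectors = defaultdict(lambda: defaultdict(int))
    for query, url in log:
        vectors[query][url] += 1
    return {q: dict(clicks) for q, clicks in vectors.items()}


def cosine(a, b):
    """Cosine similarity between two sparse click-count vectors."""
    dot = sum(w * b[u] for u, w in a.items() if u in b)
    if dot == 0:
        return 0.0
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)


def cluster_queries(vectors, threshold=0.5):
    """Offline: greedy single-pass clustering (an assumed stand-in for the
    thesis's clustering step). A query joins the first cluster whose seed
    (first member) is at least `threshold` similar; otherwise it starts a
    new cluster."""
    clusters = []
    for query in sorted(vectors):  # sorted for a deterministic result
        for members in clusters:
            if cosine(vectors[query], vectors[members[0]]) >= threshold:
                members.append(query)
                break
        else:
            clusters.append([query])
    return clusters


def recommend(query, vectors, clusters, top_n=3):
    """Online: pick the cluster whose seed is most similar to the query,
    then return its Top-N other members ranked by similarity."""
    if query not in vectors or not clusters:
        return []
    qv = vectors[query]
    best = max(clusters, key=lambda members: cosine(qv, vectors[members[0]]))
    scored = [(cosine(qv, vectors[m]), m) for m in best if m != query]
    scored = [(s, m) for s, m in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [m for _, m in scored[:top_n]]


# Hypothetical search log: (query, clicked URL) pairs.
sample_log = [
    ("jaguar car", "cars.example/jaguar"),
    ("jaguar car", "cars.example/jaguar"),
    ("jaguar price", "cars.example/jaguar"),
    ("jaguar price", "dealer.example/price"),
    ("jaguar animal", "zoo.example/jaguar"),
    ("big cat", "zoo.example/jaguar"),
]
sample_vectors = build_click_vectors(sample_log)
sample_clusters = cluster_queries(sample_vectors)
print(recommend("jaguar car", sample_vectors, sample_clusters))
```

In this sketch the expensive steps (`build_click_vectors` and `cluster_queries`) run once offline, so the online call to `recommend` only compares the query against cluster seeds and the members of one cluster, which reflects the abstract's argument for why moving clustering offline keeps recommendation fast.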
Keywords/Search Tags: Query Recommendation, Clustering, Log Mining, VSM (Vector Space Model), Similarity Calculation