Technology Researches Of Query Refinement Based On User Intent

Posted on: 2017-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: J L Wu
Full Text: PDF
GTID: 2348330518970792
Subject: Software engineering
Abstract/Summary:
Search engines let users retrieve information from the web, which greatly relieves their information anxiety. Because queries are short, however, they are often ambiguous, and a search engine based on keyword matching cannot recognize polysemy. Query refinement is one way to identify user intent. However, the session segmentation methods used in this technique have defects, and candidate queries generated from session co-occurrence information are likely to deviate from the original user intent. As a result, the user intents identified through query refinement overlap. After studying the theory and techniques of query refinement in depth, this paper builds a query refinement model based on user intent recognition, using the AOL query logs. It mainly discusses how to divide query intent, how to identify the query reformulations that express the user intent of the original query, and how to cluster query intents. Because the existing algorithms have the problems above, this paper focuses on improving the model.

The model has three parts: identifying user sessions, computing the query refinements of the original query, and clustering the query refinements. In the first part, to address the shortcomings of lexical similarity between queries, this paper uses the click similarity of queries. In the second part, because candidate queries generated from co-occurrence information tend to deviate from the original user intent, the model considers four factors: the mutual co-occurrence information of queries, the expression similarity between queries, the time distance between queries, and the click similarity between queries. From these it computes the probability that a candidate query expresses the user intent of the original query.
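
As an illustration of the first part, the following is a minimal sketch of session segmentation by click similarity. The abstract does not define the similarity measure, so the Jaccard overlap of clicked URL sets, the function names, and the threshold value are all assumptions made for this example, not the method of the thesis.

```python
def click_similarity(urls_a, urls_b):
    """Jaccard overlap of two clicked-URL collections (an assumed measure)."""
    a, b = set(urls_a), set(urls_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def segment_sessions(log, threshold=0.3):
    """Split one user's time-ordered (query, clicked_urls) records into sessions.

    A record joins the current session when its click similarity to the
    previous record reaches the (hypothetical) threshold; otherwise it
    starts a new session.
    """
    sessions = []
    for query, urls in log:
        if sessions and click_similarity(sessions[-1][-1][1], urls) >= threshold:
            sessions[-1].append((query, urls))
        else:
            sessions.append([(query, urls)])
    # Return only the query strings of each session.
    return [[q for q, _ in session] for session in sessions]


# Hypothetical log entries, not taken from the AOL data.
example_log = [
    ("jaguar", ["a.com", "b.com"]),
    ("jaguar car", ["a.com", "c.com"]),
    ("python tutorial", ["d.com"]),
]
example_sessions = segment_sessions(example_log)
```

With this measure, two queries that share no words can still fall into the same session when users click the same results for both, which lexical similarity alone would miss.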
Finally, because the user intents identified through query refinement overlap, this paper addresses the problem of clustering query refinements. Clustering query refinements raises two problems of its own: sparse query vectors and inaccurately computed transition probabilities. To address the sparsity of query vectors, this paper performs multiple random walks on a Markov graph that approximates user search behavior. To compute the transition probabilities more accurately, it takes into account factors such as the URL, sort number, and order number and, following the idea of TF-IDF, defines a similar model named CF-IQF to compute the edge weights. It then computes the absorbing-state distribution and uses it to reconstruct the query refinement vectors. Finally, this paper uses cosine similarity to compare two discrete probability distributions, and repeatedly picks the pair of clusters with the highest complete-link similarity and merges them. Experimental results show that the model and the algorithm are efficient.
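
The final clustering step can be sketched as follows: cosine similarity between distribution vectors, with complete-link agglomerative merging. The stopping threshold `min_similarity`, the function names, and the example vectors are assumptions for illustration; the abstract does not state a stopping criterion or give data.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0


def complete_link(c1, c2, vectors):
    """Complete-link similarity: the similarity of the least similar pair."""
    return min(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)


def cluster(vectors, min_similarity=0.5):
    """Merge the most similar pair of clusters until none is similar enough.

    min_similarity is a hypothetical stopping threshold.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best, (a, b) = max(
            (complete_link(clusters[a], clusters[b], vectors), (a, b))
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if best < min_similarity:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical distributions for three query refinements.
example_vectors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
]
example_clusters = cluster(example_vectors)
```

Complete-link merging scores a pair of clusters by their least similar members, so it tends to produce compact clusters in which every refinement is close to every other.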
Keywords/Search Tags: User intent recognition, Query refinement, Random walk, Clustering query refinements