Research On Query Suggestion Using Click-Skeleton Graph

Posted on: 2014-08-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Liao
Full Text: PDF
GTID: 1268330425985835
Subject: Computer application technology
Abstract/Summary:
Advances in information retrieval have made search engines an important way to find information in daily life. To find relevant information efficiently, users need to submit accurate queries. However, limited by their knowledge of the topic and by personal habit, users often submit short and irregular queries, which makes it difficult for a search engine to understand their intent. Query suggestion was proposed to help users formulate their queries; its essential goal is to offer related suggestions once the user intent behind a query is understood.

Search logs provide a practical data foundation for query suggestion, since they record real user search activities that can be used to find queries related to a user's input. However, the huge and growing volume of web search logs poses two challenges to traditional query suggestion methods: first, how to obtain accurate descriptions of information needs from large-scale search logs; second, how to correctly understand user intent from ambiguous search logs. Traditional approaches ignore the accuracy of the information need description and do not systematically model the user search process to understand user intent, so their suggestion performance is limited.

To address these challenges, we propose the Click-Skeleton graph to mine the principal components of search logs. The Click-Skeleton graph unveils the major "transactions" between queries and URLs, and thus yields the major, representative descriptions of user information needs. Based on the Click-Skeleton graph, we leverage search context and task information in the user search process to systematically capture the user search context and so correctly understand user intent.

The major content of this work is as follows.

First, to mine accurate descriptions of user information needs from large-scale search logs, we propose to mine the principal components of the click-through bipartite graph as the Click-Skeleton graph. Specifically, we formulate the construction of the Click-Skeleton graph as an optimization problem that keeps the query-URL pairs with the highest click frequencies in the click-through bipartite graph, which yields the most representative queries together with their most frequently clicked URLs. To extract the Click-Skeleton graph from a large-scale click graph, we propose a distributed extraction algorithm based on the MapReduce programming model, which overcomes the memory and disk constraints of a single machine. Based on the extracted Click-Skeleton graph, we propose a skeleton-based random walk algorithm that improves the accuracy of suggested queries by filtering out inaccurate or non-representative suggestion candidates.

Second, to understand user search intent, we propose a variable length Hidden Markov Model (vlHMM for short) to systematically model the search context. Here, context means the user's search activities during a certain period of time. With the vlHMM, we can capture the user search context, obtain the high-order dependencies between queries, and represent user intent with hidden states. To address the challenges of parameter estimation for the vlHMM, we propose a distributed Expectation-Maximization method to learn the model parameters. Based on the learned vlHMM, we can dynamically capture user search contexts and use the query distributions of the hidden states to provide context-aware query suggestions.

Third, to capture the atomic information need behind a search context and to model changes in user search needs, we propose to segment the search context into search tasks. To mine search tasks within a search context, we propose a task extraction approach based on query clustering. Specifically, we learn query similarity with a supervised learning method and then group queries into tasks with a nearest-neighbor clustering method. Further, because existing query suggestion methods ignore search task information, we design a task-based random walk query suggestion algorithm, which aims to provide task-related recommendations and improve the relevance of query suggestions.

We have conducted extensive experiments and analyses using large-scale search logs from the commercial search engine Bing. The experimental results show that the Click-Skeleton graph obtains accurate descriptions of user information needs and thereby improves the accuracy of query suggestion. Building on the Click-Skeleton graph, modeling the search context and search tasks helps in understanding user search intent and thus improves the relevance of query suggestions.
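
As a rough illustration of the Click-Skeleton idea described above, the following single-machine Python sketch keeps, for each query, only its most frequently clicked URLs and drops weakly supported pairs. It is not the dissertation's optimization formulation or its MapReduce extraction algorithm: the per-query `top_k` cutoff, the `min_clicks` threshold, the function names, and the sample URLs are all assumptions made for this example.

```python
from collections import defaultdict


def click_skeleton(click_log, top_k=1, min_clicks=2):
    """Keep each query's top_k most-clicked URLs with at least min_clicks clicks.

    click_log is an iterable of (query, url) click events. top_k and
    min_clicks are illustrative parameters, not values from the source.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for query, url in click_log:
        counts[query][url] += 1

    skeleton = {}
    for query, url_counts in counts.items():
        # Sort by descending click count, breaking ties by URL for determinism.
        ranked = sorted(url_counts.items(), key=lambda kv: (-kv[1], kv[0]))
        kept = [(url, n) for url, n in ranked[:top_k] if n >= min_clicks]
        if kept:  # queries with no well-supported URL are dropped entirely
            skeleton[query] = kept
    return skeleton


def related_queries(skeleton, query):
    """Return other skeleton queries that share a kept URL with `query`."""
    urls = {url for url, _ in skeleton.get(query, [])}
    return sorted(
        other
        for other, edges in skeleton.items()
        if other != query and urls & {url for url, _ in edges}
    )


# Hypothetical click events; the URLs are placeholders.
log = (
    [("jaguar", "example.com/jaguar-cars")] * 3
    + [("jaguar car", "example.com/jaguar-cars")] * 2
    + [("jaguar", "example.org/jaguar-animal")]
    + [("jagaur", "example.com/jaguar-cars")]
)
skeleton = click_skeleton(log)
suggestions = related_queries(skeleton, "jaguar")
```

In this toy log, the rarely clicked pair for "jaguar" and the misspelled one-click query "jagaur" fall outside the skeleton, so only "jaguar car" remains as a suggestion candidate for "jaguar". A skeleton-based random walk, as the abstract describes, would rank candidates over this reduced graph rather than simply listing URL-sharing neighbors.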
Keywords/Search Tags: query suggestion, search log mining, Click-Skeleton graph, search context modeling, search task modeling