Research On Query Expansion Based On Topic Model

Posted on: 2015-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Yan
Full Text: PDF
GTID: 2308330479989750
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, more and more information is available online, and information retrieval has become an indispensable way of obtaining information in daily life. The emergence of search engines has largely met people's needs, making it easier to cope with the mass of information on the Internet. However, an important factor that degrades the search experience is that user queries are usually short. People often search with only a few keywords, so the query as expressed can be inconsistent with the user's intent, which largely affects the user experience. Many studies have therefore focused on query expansion techniques, which use the user's initial query terms to derive new search terms in order to improve retrieval performance. As a query optimization method for information retrieval, query expansion is a significant research topic.

In query expansion, if a term added to the original query is not relevant to it, retrieval quality declines. This is especially true in Web search, where documents often cover a number of different topics. To address these problems, query expansion methods based on topic models have been proposed. Because they consider the semantic correlation of topics between the query and the documents, they have attracted more and more researchers.

This paper analyses topic-model-based query expansion in depth and proposes two methods: a query expansion model based on the mutual information of topics, and a query expansion model based on topic word pairs. Both methods use the LDA topic model to improve retrieval performance.

(1) In the model based on mutual information of topics, we use both the mutual information of the query terms and the relevance of the topic to expand the query terms. We use this method to solve the topic label selection problem in topic-based query expansion models. To guarantee the relevance between the topic labels and the query words, we use mutual information to measure the relevance between the topic words and the query words.

(2) In the model based on topic word pairs, we innovatively use topic vectors to measure the semantic similarity between the query terms and the candidate words when selecting expansion words. We consider not only the correlation between word pairs but also the Dice similarity coefficient, and we combine these two similarity factors between the query terms and the candidate words to select the expansion words.

We add the expansion words to the original query to form a new query for search and obtain the final results. The experiments show that, compared with the classical methods RM3 and LCA and with topic-based query expansion methods, our methods effectively select good expansion terms and improve retrieval accuracy.
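
The abstract names its building blocks (mutual information, topic vectors, the Dice coefficient) but does not give the formulas, so the following is only a minimal sketch under stated assumptions, not the thesis's implementation. It assumes document-level co-occurrence counts, uses pointwise mutual information as a stand-in for the mutual-information measure, uses cosine similarity between per-word topic vectors, and combines the two similarity factors with a hypothetical linear weight. The toy documents and topic vectors are made up for illustration; in practice the vectors would come from a trained LDA model.

```python
import math

def _doc_freqs(term_a, term_b, docs):
    """Count documents containing each term and both terms."""
    df_a = sum(1 for d in docs if term_a in d)
    df_b = sum(1 for d in docs if term_b in d)
    df_ab = sum(1 for d in docs if term_a in d and term_b in d)
    return df_a, df_b, df_ab

def pmi(term_a, term_b, docs):
    """Pointwise mutual information (assumed stand-in for the thesis's measure)."""
    df_a, df_b, df_ab = _doc_freqs(term_a, term_b, docs)
    if df_ab == 0:
        return float("-inf")  # the terms never co-occur
    return math.log(df_ab * len(docs) / (df_a * df_b))

def dice(term_a, term_b, docs):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|)."""
    df_a, df_b, df_ab = _doc_freqs(term_a, term_b, docs)
    total = df_a + df_b
    return 2 * df_ab / total if total else 0.0

def cosine(u, v):
    """Cosine similarity between two topic vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def score_candidate(query_term, candidate, docs, topic_vectors, weight=0.5):
    """Hypothetical linear mix of topic-vector similarity and Dice similarity."""
    topic_sim = cosine(topic_vectors[query_term], topic_vectors[candidate])
    return weight * topic_sim + (1 - weight) * dice(query_term, candidate, docs)

# Toy data (illustrative only): each document is a set of terms.
docs = [
    {"query", "expansion", "topic"},
    {"query", "retrieval"},
    {"topic", "model", "lda"},
    {"query", "expansion"},
]
# Made-up two-topic distributions per word.
topic_vectors = {
    "query": [0.7, 0.3],
    "expansion": [0.6, 0.4],
    "lda": [0.1, 0.9],
}

# Rank candidate expansion words for the query term "query".
ranked = sorted(
    ["expansion", "lda"],
    key=lambda w: score_candidate("query", w, docs, topic_vectors),
    reverse=True,
)
```

With this toy data, `ranked` places "expansion" ahead of "lda", since "expansion" both co-occurs with "query" and has a closer topic vector. The equal weighting is an arbitrary choice for the sketch; the abstract does not say how the two factors are combined.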
Keywords/Search Tags: Query expansion, Topic model, Topic word pair, Mutual information