Research On Personalized Information Retrieval Based On User Search History

Posted on: 2016-01-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X C Wang
Full Text: PDF
GTID: 1108330503469584
Subject: Computer application technology
Abstract/Summary:
Personalized information retrieval tailors the ranking of documents to individual interests, and has long been recognized as a way to greatly improve the search experience. It puts the user first and returns user-dependent search results, considering not only how well a document matches the query but also how well it matches the user’s interests and preferences. As the best way to obtain individual interests and preferences, user search history, which is rich in such information, has attracted considerable attention from researchers in personalized information retrieval. To investigate how different kinds of history affect personalization, this paper first analyzes and quantifies the similarity between search history and the current search results, and then studies methods that use short-term history, long-term history, and a combination of both to improve retrieval performance.

1) Based on the relationship between search history and retrieval results, a quantitative analysis is made of the similarity between the user’s short-term and long-term search history and the current queries and clicks. The relationship is analyzed within the vector space model framework from four aspects: the similarity ratio, the degree of similarity, the linear relationship, and differences in specific content. The results show that 79.55% of queries can obtain relevant information from the user’s history, with short-term history covering the larger proportion (71.23%) and also showing a high degree of similarity. For the same query, different types of user history provide different relevant information, so combining them is likely to further improve retrieval performance.

2) For making rational use of short-term search history, a novel personalized retrieval method with adaptive short-term history weighting is proposed.
To assign a suitable weight to short-term history, the similarity between the short-term history and the current query is taken as the core clue. Rich interaction features are extracted from the current query and from short-term historical queries and clicks to characterize the user’s interests and preferences, and an SVM regression model is built to predict the weight of the short-term history. Experiments show that the proposed method adapts to the retrieval environment and dynamically assigns a weight to each query’s short-term history, effectively improving personalized retrieval performance.

3) For using long-term history, this paper introduces an incremental hierarchical clustering algorithm to build long-term interest models accurately, leading to an updated query model estimate. Long-term history accumulates and is updated continuously over time, so its content is rich but dispersed, and it also contains information irrelevant to the current query. To deal with this problem, the incremental hierarchical clustering algorithm is used to gradually build a long-term interest tree for each individual user, and only the information in the most similar interest node of the tree is used to help estimate the long-term interest model, as a beneficial supplement to the query. Experiments show that incremental hierarchical clustering of the long-term history significantly outperforms existing methods, addresses the diversity and dynamics of long-term history, and further improves personalized retrieval performance.

4) On the basis of the above research, a personalized retrieval framework that fuses short-term and long-term history is designed. Because short-term and long-term history affect search results differently, the method in this paper integrates them on both the query and the document.
On the one hand, the query model is accurately estimated with long-term search history; on the other hand, differences in the value of a document to different users are considered. In the experiments, a comprehensive comparison of various combinations of search history is performed. The results show that fusing short-term and long-term history is better than using only one kind of history, and that the best performance is achieved when short-term and long-term search history are fused on both the query and the document.
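
The abstract gives no implementation details, so the following is only a minimal sketch of how the ideas it names could fit together: cosine similarity in a vector space model, incremental grouping of history into interest nodes, and use of only the most similar node to supplement the query. Several simplifications are assumptions made here, not the author's method. A flat, threshold-based incremental clustering stands in for the incremental hierarchical clustering and its interest tree. The history weight is a fixed constant, whereas the abstract predicts it per query with SVM regression. All function names, the threshold of 0.3, and the weight of 0.4 are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for a piece of text."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def add_to_interest_nodes(nodes, item, threshold=0.3):
    """Assign one history item to its closest node, or open a new node.

    Flat stand-in (assumption) for incremental hierarchical clustering;
    the threshold value is illustrative only.
    """
    vec = term_vector(item)
    best = max(nodes, key=lambda node: cosine(node, vec), default=None)
    if best is not None and cosine(best, vec) >= threshold:
        best.update(vec)  # merge term counts into the existing node
    else:
        nodes.append(vec)
    return nodes


def most_similar_node(nodes, query):
    """Return the interest node closest to the query (empty if no nodes)."""
    q = term_vector(query)
    return max(nodes, key=lambda node: cosine(node, q), default=Counter())


def expand_query(query, node, history_weight=0.4):
    """Interpolate normalized query and node term distributions.

    history_weight is a fixed constant here (assumption); the abstract
    instead predicts a weight per query with a regression model.
    """
    q = term_vector(query)
    q_total = sum(q.values()) or 1
    n_total = sum(node.values()) or 1
    return {
        term: (1 - history_weight) * q[term] / q_total
        + history_weight * node[term] / n_total
        for term in set(q) | set(node)
    }


# Toy long-term history with two unrelated interests.
history = [
    "python pandas dataframe merge",
    "pandas dataframe groupby",
    "jaguar car price",
    "jaguar car dealer",
]
nodes = []
for entry in history:
    add_to_interest_nodes(nodes, entry)

query = "jaguar price"
model = expand_query(query, most_similar_node(nodes, query))
```

In this toy run the query "jaguar price" picks up weight on "car" and "dealer" from the matching interest node, while terms from the unrelated programming node stay out of the query model. This mirrors the abstract's point that only the most similar part of a dispersed long-term history should supplement the query.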
Keywords/Search Tags:Personalized information retrieval, User search history, Query model, Incremental hierarchical clustering, Short-term and long-term history fusion
Related items