Research On Fusion-Based Methods For Search Result Diversification

Posted on: 2018-01-15  Degree: Master  Type: Thesis
Country: China  Candidate: C L Xu  Full Text: PDF
GTID: 2348330533959488  Subject: Computer technology
Abstract/Summary:
Queries submitted to an information retrieval system are often ambiguous or multi-faceted, and it is difficult for a traditional system to identify the user's intent correctly and retrieve documents that satisfy the user's information need. Search result diversification has therefore attracted considerable attention as a means of tackling query ambiguity. One practical solution is to return results that cover as many aspects of the ambiguous query as possible, so that users with different intents can each find at least one highly ranked document that satisfies their information need. A typical strategy is a two-stage process: (1) use a traditional search engine to obtain a ranked list of documents, in which relevance is the only concern; (2) re-rank that list to promote diversity. In recent years, some researchers have investigated how to use data fusion to improve search result diversity. Data fusion can be applied at either of the two stages, which gives two strategies: early fusion and late fusion. This dissertation focuses on how data fusion methods can support search result diversification, and it investigates and compares these two fusion strategies. The main contributions are as follows:

(1) For the late fusion strategy, we focus on the linear combination method and explore suitable weight assignment strategies for result diversification. Following the evolutionary-algorithm approach of searching for an optimal solution, we propose a weight allocation strategy based on the differential evolution algorithm. The scheme represents each set of fusion weights as a weight vector, that is, an individual in the population of the differential evolution algorithm. Through repeated selection and elimination of individuals, the algorithm produces an optimal individual, which improves the performance of the linear combination method.

(2) Score normalization is a necessary step for explicit search result diversification methods: it transforms the relevance-related scores or the rankings of a list of ranked documents into reasonable probabilities. We investigate the impact of six score normalization methods on the performance of three typical explicit diversification methods, which provides some evidence for choosing a score normalization method before diversification.

(3) To mitigate the deficiency of late fusion, we propose the early fusion strategy to support search result diversification. The component results fused in early fusion concern relevance only, and the fused result then needs re-ranking for diversification. The whole process involves fusing once and re-ranking once; therefore, it is more efficient than the late fusion strategy. Our investigation also finds that early fusion is as good as late fusion in effectiveness.
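
As a rough illustration of the linear combination method and the score normalization step mentioned in the abstract, the sketch below fuses two result lists by a weighted sum of normalized scores. It is only a minimal sketch under stated assumptions: min-max normalization is used as one common choice (the abstract does not name the six methods studied), the weights are fixed by hand rather than learned by differential evolution, and the function names `min_max_normalize` and `linear_combination` as well as the sample runs are hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Map raw scores into [0, 1]; if all scores are equal, map them all to 1.0.

    Assumption: min-max is used here only as an example normalization.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def linear_combination(result_lists, weights):
    """Fuse {doc_id: score} result lists by a weighted sum of normalized scores."""
    if len(result_lists) != len(weights):
        raise ValueError("need exactly one weight per result list")
    fused = defaultdict(float)
    for scores, weight in zip(result_lists, weights):
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += weight * s
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical component results from two retrieval systems.
run_a = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
run_b = {"d2": 0.9, "d3": 0.8, "d4": 0.1}

# Hand-picked weights; the dissertation instead searches for them
# with differential evolution.
fused = linear_combination([run_a, run_b], weights=[0.6, 0.4])
for doc, score in fused:
    print(doc, round(score, 3))
```

In a weight-learning setting, each candidate weight vector would be scored by running a fusion like this on training queries and evaluating the fused list with a diversity metric; that evaluation loop is omitted here.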
Keywords/Search Tags: search result diversification, data fusion, score normalization, linear combination, weight assignment