Font Size: a A A

Research Of Learning To Rank Based On Directly Optimizing Evaluation Measures In Information Retrieval

Posted on:2014-01-19Degree:MasterType:Thesis
Country:ChinaCandidate:P ZhangFull Text:PDF
GTID:2248330395999601Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the coming of Web2.0age, the information on the Internet has seen explosive growth. How to quickly retrieve the data that user needed from the mass of information has become the key point in information retrieval research area. Ranking plays an important role in information retrieval system. Traditional retrieval models generally could be divided into two categories:the query-dependent models which based on document content, judge the relevance between query and document, such as Boolean Model, Vector Space Model; query-independent models which based on link analysis, judge the importance of document itself, such as PageRank, HITS. These models each has its own characteristics. How to combine these models and create an even more effective new model become what the researchers focus, which leads to a new technology:Learning to rank. It employs machine learning technology to solve ranking problems in information retrieval and further enhance the ranking performance. Learning to rank algorithms generally can be divided into three categories:the pointwise approach, the pairwise approach and the listwise approach. Besides, the listwise approach could be divided into two sub-categories, the first sub-category measures the difference between the permutation by the ranking model and the ground truth permutation; the second sub-category defines the loss function based on the bound of IR evaluation measures, which is more similar to the real ranking and could generally performs better than the first two categories’. In this paper, we focus on the research of directly optimizing evaluation measures and put forward a series of learning to rank methods which further improve the ranking performance. The main achievement are as follows.At first, we proposed a learning to rank algorithm SVMERR which employs structural SVMs method to optimize the Expected Reciprocal Rank (ERR). Compared with the methods that optimizing MAP or NDCG metric, optimizing ERR metric could make more contribution to ranking performance. For ERR metric is based on cascade model which considers user browsing behavior and structural SVMs could find the global optimal solution, the combination of the two methods could improve ranking performance.Secondly, we extended the boosting based ranking algorithm AdaRank and utilized it to optimize other three evaluation metric, namely ERR, MRR and Q-measure. We expected that employing AdaRank algorithm to optimize the three evaluation measure could achieve good ranking accuracy. Learning to rank algorithms are based on features, however, there are a little research in terms of feature generation. Inspired by this, we proposed a kind of feature generation framework FGFIREM which regards the proposed ranking algorithms mentioned above as feature generation factors. Besides, we utilized the proposed ranking algorithms AdaRank-ERR and AdaRank-Q to learn ranking models on new feature sets. The experimental results in LETOR3.0indicated that the methods proposed in this paper improved the ranking performance.
Keywords/Search Tags:Information Retrieval, Learning to Rank, Directly Optimizing EvaluationMeasure, Feature Generation, Support Vector Machine
PDF Full Text Request
Related items