Research On Training Learning To Rank Algorithm With Heterogeneous Data

Posted on:2017-03-29

Degree:Master

Type:Thesis

Country:China

Candidate:T Zhang

Full Text:PDF

GTID:2308330485980611

Subject:Computer software and theory

Abstract/Summary:

Ranked data is limited, whereas classified data is effectively unlimited and easy to obtain. We therefore define a new setting in which both ranked and classified data serve as training data for a ranking algorithm, and we propose a framework for training learning-to-rank algorithms with heterogeneous data, using both kinds of data to train a text ranker. In this framework, classified and ranked data are both transformed into preferences between pairs of data points, in the same way that pairwise algorithms reduce the learning-to-rank problem to classifying the preference between a pair of data points. A pairwise algorithm can therefore be modified to train on heterogeneous data.

We use a directed graph to describe preferences intuitively. Pairwise learning-to-rank algorithms are based on the preference between pairs of data points, and classified data also carries a preference, namely that between the positive class and the negative class. Adding classified data to ranked data therefore supplies more preference information, with the aim of improving the performance of pairwise learning-to-rank algorithms.

We transform standard datasets to simulate the real situation. In the experiments, we convert the MQ2007, MQ2008 and OHSUMED datasets into heterogeneous datasets and train a RankSVM algorithm, adapted to the new setting, on ranked and classified data mixed in a given proportion. Comparing the algorithm trained only on ranked data with the algorithm trained on heterogeneous data shows the improvement obtained. The results show that on OHSUMED, heterogeneous data improves performance by 12.4% on MAP and by 22.8% on NDCG. On MQ2007 and MQ2008, the improvement is less significant.
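The transformation described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the function names (`pairs_from_ranked`, `pairs_from_classified`, `pairwise_examples`), the toy document IDs, and the feature values are all hypothetical, and it assumes that every positive-class item is preferred over every negative-class item. Each preference pair is an edge of the directed graph, and each edge becomes a feature-difference vector, which is the usual input form for a RankSVM-style linear classifier.

```python
from itertools import combinations, product


def pairs_from_ranked(doc_ids, relevance):
    """Preference pairs (preferred, other) from graded relevance labels of one query."""
    pairs = []
    for a, b in combinations(range(len(doc_ids)), 2):
        if relevance[a] > relevance[b]:
            pairs.append((doc_ids[a], doc_ids[b]))
        elif relevance[b] > relevance[a]:
            pairs.append((doc_ids[b], doc_ids[a]))
        # Equal labels carry no preference, so no pair is emitted.
    return pairs


def pairs_from_classified(positive_ids, negative_ids):
    """Assumption: every positive item is preferred over every negative item."""
    return list(product(positive_ids, negative_ids))


def pairwise_examples(pairs, features):
    """Turn preference pairs into difference vectors labelled +1 / -1."""
    X, y = [], []
    for preferred, other in pairs:
        diff = [p - q for p, q in zip(features[preferred], features[other])]
        X.append(diff)
        y.append(1)
        # Mirror each pair so the binary classifier sees both classes.
        X.append([-d for d in diff])
        y.append(-1)
    return X, y


# Hypothetical toy data: three ranked documents and two classified ones.
features = {
    "d1": [0.9, 0.2], "d2": [0.4, 0.1], "d3": [0.1, 0.7],
    "p1": [0.8, 0.3], "n1": [0.2, 0.6],
}
ranked_pairs = pairs_from_ranked(["d1", "d2", "d3"], [2, 1, 1])
classified_pairs = pairs_from_classified(["p1"], ["n1"])

# The union of both edge sets is the heterogeneous training signal.
edges = ranked_pairs + classified_pairs
X, y = pairwise_examples(edges, features)
```

The resulting `X` and `y` could be passed to any linear binary classifier. How the thesis weights or mixes the two edge sets beyond "a given proportion" is not stated in the abstract, so the sketch simply concatenates them.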

Keywords/Search Tags:

Learning to Rank, Information Retrieval, RankSVM, Pairwise

Related items

1	Research On Learning To Rank For Information Retrieval
2	Chinese Webpage Feature Extraction In Learning To Rank Algorithms
3	Researches On Information Retrieval Model Based On The Algorithm Of Learning To Rank
4	Research Of Learning To Rank In Information Retrieval
5	Research On Information Retrieval Model Based On Learning To Rank
6	Research Of Learning To Rank In Information Retrieval
7	Research Of Learning To Rank Based On Directly Optimizing Evaluation Measures In Information Retrieval
8	Learning To Rank For Biomedical Information Retrieval
9	Theoretical Analysis On Direct Optimization Of Information Retrieval Measures In Learning To Rank
10	Research On Learning To Rank Algorithm For Information Retrieval