Research On Semi-supervised Ranking Algorithms

Posted on:2015-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:Z G Miao

GTID:2268330428999841

Subject:Computer software and theory

Abstract/Summary:

Learning to rank is an active research topic in information retrieval and machine learning, with applications in problems such as document retrieval, collaborative filtering, and natural language parsing. Its goal is to automatically learn a ranking model from training data using machine learning techniques. Many algorithms have been developed for the ranking problem. Depending on their input representation and loss function, these algorithms fall into three categories: the pointwise, pairwise, and listwise approaches.

Learning to rank is an instance of supervised learning and therefore requires a labeled training set. In practice, however, obtaining labeled data is time-consuming and expensive. To exploit the large amount of unlabeled data that is available, it is natural to consider the problem of semi-supervised ranking. Using semi-supervised learning techniques to extract the implicit information in unlabeled data helps reduce labeling costs and improve the performance of the ranking algorithm. This thesis therefore aims to develop semi-supervised algorithms for learning to rank. Its main contributions are as follows.

First, we propose a general framework of regularized boosting for semi-supervised ranking and, based on it, develop a semi-supervised ranking algorithm built on RankBoost. Regularization is a widely used semi-supervised learning technique that forces the learner to exploit unlabeled data by adding a regularization penalty to the usual loss function. Boosting is a simple, efficient, and theoretically justified ensemble learning method that obtains a better-performing model by iteratively building a linear combination of weak models. Combining these two techniques, we adapt the supervised ranking algorithm RankBoost to the semi-supervised setting. Specifically, we augment the traditional loss with a regularization penalty term that embodies the smoothness assumption from semi-supervised learning, so that similar examples receive similar ranking scores. We then derive an iterative training procedure, based on boosting, to optimize this loss function. The resulting algorithm both makes reasonable use of the semi-supervised assumption and retains the simplicity and efficiency of boosting.

Second, we propose a general framework for extending listwise ranking algorithms to the semi-supervised setting. Under this framework, the algorithm first labels some of the unlabeled examples according to certain rules and then runs a traditional listwise algorithm on the augmented dataset. Specifically, we extend AdaRank, a state-of-the-art listwise ranking algorithm, to the semi-supervised setting, using label propagation to label the unlabeled examples. Benefiting from the advantages of the listwise approach, the resulting algorithm is expected to substantially improve the performance of semi-supervised ranking.

Finally, experiments on the publicly available LETOR dataset, comparing against an existing semi-supervised ranking algorithm, demonstrate the feasibility of the proposed frameworks and the effectiveness of the corresponding algorithms.
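
The abstract does not state the regularized objective itself, so the following is only a minimal sketch of the general idea: a RankBoost-style pairwise exponential loss on labeled preference pairs, augmented with a smoothness penalty that pulls the scores of similar documents together. The function name `regularized_rank_loss`, the squared-difference form of the penalty, the `similarity` dictionary, and the weight `lam` are all assumptions made for illustration and are not taken from the thesis.

```python
import math


def regularized_rank_loss(scores, preferred_pairs, similarity, lam=0.1):
    """Pairwise exponential loss plus a smoothness penalty (illustrative only).

    scores: ranking scores, one per document (labeled and unlabeled).
    preferred_pairs: (i, j) tuples meaning document i should rank above j.
    similarity: dict mapping (i, j) to a non-negative similarity weight.
    lam: weight of the smoothness penalty (hypothetical default).
    """
    # Supervised part: small when the preferred document scores much higher.
    pair_loss = sum(math.exp(scores[j] - scores[i]) for i, j in preferred_pairs)
    # Semi-supervised part: similar documents should receive similar scores.
    smoothness = sum(
        w * (scores[i] - scores[j]) ** 2 for (i, j), w in similarity.items()
    )
    return pair_loss + lam * smoothness


# Document 0 is labeled as better than document 1; document 2 is unlabeled
# but similar to document 1, so the penalty grows as their scores diverge.
pairs = [(0, 1)]
sim = {(1, 2): 1.0}
smooth_loss = regularized_rank_loss([2.0, 1.0, 1.0], pairs, sim)
rough_loss = regularized_rank_loss([2.0, 1.0, 0.0], pairs, sim)
```

In this sketch the unlabeled document contributes only through the penalty term, which is how a regularizer can make a learner use unlabeled data. A boosting procedure would reduce such a loss by adding one weak ranker per round, a step this sketch does not attempt to reproduce.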

Keywords/Search Tags:

Learning to rank, Semi-supervised learning, Regularization, Boosting, RankBoost, AdaRank

Related items

1	The Research Of Semi-supervised Learning Based On Boosting
2	Online Semi-Supervised Learning Theory, Algorithms And Applications
3	Robust Semi-Supervised Multi-Label Learning By Triple Low-Rank Regularization With Application To Automatic Image Annotation
4	Research On Semi-Supervised Learning-to-rank Algorithm Based On Low-Rank Graph
5	Semi-Supervised Learning With Multiple Views
6	Distributed Semi-supervised Learning Algorithms Based On Manifold Regularization
7	Research On Semi-Supervised Multi-Task Learning Based On Regularization
8	Research On The Application Of Geometric Information In The Semi-supervised Learning
9	Research On Image Classification Algorithm Based On Semi-supervised Learning
10	Research On Graph-Based Semi-Supervised Learning And Its Applications