Research On Learning To Rank Algorithm For Information Retrieval

Posted on:2019-02-16

Degree:Master

Type:Thesis

Country:China

Candidate:D L Wang

Full Text:PDF

GTID:2348330545998789

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

Learning to rank is a technology which combine machine learning and information retrieval.This method is a study between classification and regression,it is to solve the questions about ranking.The so called learning to rank is which use a set of tagged data and machine learning techniques automatically generates a ranking model that meets the user's requirements.Since learning to rank is widely used in document retrieval,recommendation system and other fields,learning to rank in recent years has attracted more and more researchers' attention.First,this thesis introduces the basic concepts of learning to rank,then introduces three groups of learning methods:pointwise method based on a single sample,pairwise method based on a pair sample of the document and the listwise based on the sample list of documents.In this thesis,the representative algorithms of three methods are introduced,and the traditional learning to rank algorithms are analyzed and the advantages and disadvantages are summarized.Then this thesis introduces some ranking algorithms based on direct optimization of rank measure,considering that such algorithms can get dependable ranking model,they often do not work well in some real situations,such as high dimension and large-scale.In this way,two more efficient learning to rank algorithms are proposed.Specifically,the thesis mainly includes the following aspects:(1)The most existing work on learning to rank assumes that the training data is clean,but due to the ambiguity of query intent,the lack of domain knowledge,and the vague definition of relevance levels,all make it difficult for common annotators to give reliable relevance labels to some documents.As a result,the relevance labels in the training data of learning to rank usually contain noise.If ranking model ignores this fact,the performance of learning to rank algorithms will be damaged.In view of this problem,an algorithm based on multi-objective learning to rank algorithm is proposed.The algorithm firstly defines three objectives:the gain function,the risk function and the number of samples,and the three targets are optimized by the NSGA-II evolutionary algorithm.Then,a Hybridization-encouraged Mechanism selection strategy was proposed.And using this strategy can further improve the performance of the algorithm.Experiments on the benchmark data set verify the effectiveness of the proposed algorithm.(2)Compared with the existing works,which mainly focus on the accuracy of the model,it is more challenging to design the ranking model with sparsity.To address the issue,in this paper,a mini-batch stochastic gradient method for sparse ranking was presented.The algorithm begins with formulating sparse learning to rank as a mini-batch based convex optimization problem with L1 regularization.Then for the problem that simple adding L1 term does not necessarily induce the sparsity,FTRL(Follow-The-Regularized-Leader)method is adopted for inner optimization,which can obtain good solution with high sparsity.A feature-wise learning rates updating strategy is also suggested,and with this strategy,the performance of the proposed algorithm can be further improved.Empirical studies on LETOR data sets confirm the significant advantage of the algorithm in comparison with several baseline methods.

Keywords/Search Tags:

Learning to rank, Noise, NSGA-?, Sparse Model, FTRL, Mini-batch

PDF Full Text Request

Related items

1	Research On Model Learning Based On Sparse And Low-rank Constraints
2	Sparse And Low Rank Theory And Their Applications
3	A Study On Sparse And Low-rank Machine Learning Problems
4	Structured Sparse And Low-Rank Representation Learning Method And Application Research
5	Sparse And Low-rank Theory Based Feature Selection Algorithms
6	Research On Image Reconstruction Algorithm Based On Sparse Representation And Low Rank Model
7	Research On Constrained Low Rank Representation Algorithm And Its Applications
8	Research On Wmsn Video Denoising Based On Improved Model Of Low-rank Decomposition
9	Research And Development Of Manufacture Management Information System For Single And Mini-Batch Production
10	Person Re-Identification Based On Visual Saliency Enhancement And Structured Low-Rank And Sparse Learning