Research On Learning To Rank Algorithm Based On Feature Selection

Posted on: 2019-09-06 | Degree: Master | Type: Thesis
Country: China | Candidate: S L Chen | Full Text: PDF
GTID: 2428330548467880 | Subject: Computer technology
Abstract/Summary:
A large amount of data is generated on the Internet every day, and the cumulative total has reached trillions of web pages. Users obtain pertinent information through retrieval tools. Such tools rely on dedicated computer systems to collect information from the Internet according to a certain strategy, then organize and process it, and finally provide the corresponding query service. As a new research field, learning to rank (LtR) has been shown to handle the ranking of results for user queries effectively by means of machine learning techniques. However, large search systems must respond quickly to user queries, and the computation of features for candidate documents must conform to strict back-end latency constraints. At present, search engine companies, represented by Google, take hundreds of features into account when ranking web pages. If all features are used to build a learning to rank model, ranking efficiency is greatly reduced. The number of features must therefore be limited so that growing demands on search time and retrieved content can be met. It is all the more necessary to use feature selection to find a feature subset that satisfies the latency requirement while keeping the trained model highly effective.

Based on this, the main contents of this paper are as follows. First, we present the research background, the state of research at home and abroad, and the research ideas. This paper studies the origin of learning to rank, classifies and briefly describes learning to rank algorithms, analyzes the model framework for information retrieval and learning to rank, introduces the basic algorithms used in this paper, and then studies the basic process, common framework, basic classification and evaluation criteria of feature selection.

Secondly, feature selection is an effective method of data reduction: it identifies the most applicable subset of relevant features on the training set, and this subset can then be used to learn the model for the original task. From this point of view, this paper explores the application of feature selection in learning to rank, combining it with hierarchical clustering to take full advantage of the latter's characteristics. First, two fast filter feature selection algorithms are improved from the point of view of initial point selection, and a new fast feature selection framework is proposed. Experiments on standard datasets demonstrate the effectiveness of the proposed algorithm: it either obtains higher ranking accuracy on smaller subsets or obtains the best ranking accuracy on an intermediate subset.

Then, to address the problem of redundant features in learning to rank, a wrapper feature selection algorithm is proposed. The algorithm considers two feature selection criteria: reducing the loss of the objective function, and decreasing the overall pairwise similarity among the features in the subset. In this task, the similarity between two features is measured by the Pearson correlation coefficient, the similarity term is added to the loss function as a penalty term, and the principal features are selected by a forward-backward greedy algorithm. Experiments show that, by optimizing the loss function, the algorithm reduces the similarity between features and selects the most prominent ones. It obtains higher ranking accuracy than the filter methods on smaller feature subsets, and better results than comparable algorithms.

Finally, the paper is summarized, some thoughts on feature selection in learning to rank are given, and directions and topics for future research are put forward.
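The wrapper criterion described in the abstract (a loss term plus a Pearson-correlation similarity penalty, minimized greedily) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes a pointwise least-squares loss as a stand-in for the ranking loss, uses forward steps only, and the function names and the penalty weight `lam` are hypothetical choices made for illustration. It also assumes no feature column is constant, since the Pearson coefficient is undefined in that case.

```python
import numpy as np

def squared_loss(X, y, subset):
    """Mean squared error of a least-squares linear scorer using only `subset`.
    Assumption: stands in for the ranking loss, which the abstract does not specify."""
    A = np.column_stack([X[:, subset], np.ones(len(y))])  # selected features + bias
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ w - y) ** 2))

def mean_pairwise_similarity(sim, subset):
    """Mean |Pearson r| over all feature pairs in `subset` (0.0 for fewer than two)."""
    if len(subset) < 2:
        return 0.0
    block = sim[np.ix_(subset, subset)]
    upper = np.triu_indices(len(subset), k=1)  # each unordered pair once
    return float(block[upper].mean())

def greedy_forward_select(X, y, k, lam=0.1):
    """Greedily add the feature that minimizes loss + lam * similarity penalty.
    `lam` is a hypothetical trade-off weight, not a value from the thesis."""
    sim = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature |Pearson r|
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        scores = {}
        for j in remaining:
            candidate = selected + [j]
            scores[j] = (squared_loss(X, y, candidate)
                         + lam * mean_pairwise_similarity(sim, candidate))
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: feature 1 is a near-duplicate of feature 0; the label needs 0 and 2.
rng = np.random.default_rng(0)
f0 = rng.normal(size=200)
f1 = f0 + 0.01 * rng.normal(size=200)
f2 = rng.normal(size=200)
X = np.column_stack([f0, f1, f2])
y = f0 + f2 + 0.1 * rng.normal(size=200)
selected = greedy_forward_select(X, y, k=2)
print(selected)
```

On this toy data the penalty discourages picking both near-duplicate features, so the selected pair should contain feature 2 together with only one of features 0 and 1. A forward-backward variant would additionally try removing an already selected feature after each addition.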
Keywords/Search Tags: Information Retrieval, Learning to Rank, Feature Selection, Hierarchical Clustering, Greedy Algorithm