Top-k Learning To Rank Based On Similarity Between Documents

Posted on:2015-08-07

Degree:Master

Type:Thesis

Country:China

Candidate:X Xiao

GTID:2298330422990413

Subject:Computer Science and Technology

Abstract/Summary:

With the growing demand for information on the Internet, retrieving accurate information efficiently from search engines has become an active research topic, and ranking technology has become a vital part of the search engine. To improve user satisfaction, the accuracy of the returned results must be improved; in other words, the most relevant pages should be returned to the user. How to achieve this goal has become a research focus for search engines. In recent years the most popular approach has been to apply machine learning methods to the ranking process, because the factors that affect ranking results are very complex, and taking these factors into account is bound to yield more reasonable rankings. This approach is known as Learning to Rank.

In practical applications such as information retrieval, recommendation systems, and computational advertising, most users are mainly concerned with the first few results; that is, the results at the top of the list are crucial to user experience and satisfaction. A Top-k ranking method is therefore proposed to meet this requirement.

We improve on a previous model. First, we add correlation information between documents, so that in Top-k modeling the documents are treated not as mutually independent but as related. The correlation between documents is used as a weight on each document's score. This makes use of additional information and can improve the results. With the correlation between documents added to the new model, the proposed method optimizes the model by directly maximizing the probability of the ranking rather than by minimizing a loss function. As a result, the computation required to train the model is greatly reduced, from combinatorial to polynomial complexity.

We then propose a modified Top-k model. In the original Top-k model, many documents that belong in the first k positions are wrongly placed after position k in the first layer of the process. Consequently, however sophisticated the second-layer algorithm is, and whatever additional information it uses, the overall Top-k ranking quality does not improve. We therefore increase K in the first layer, while K remains small compared with the size N of the related document collection. More of the documents that truly belong in the first k positions are then retained, and when complex ranking algorithms are applied in the second layer, the accuracy is greatly improved.
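The ideas in the abstract can be illustrated with a short Python sketch. It is not the thesis's implementation: the abstract gives no formulas, so everything here is an assumption made for illustration, including cosine similarity over term vectors as the document correlation, the blending parameter `alpha`, the Plackett-Luce form of the top-k ranking probability, and all function names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two term-weight vectors (vector space model)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_weighted_scores(scores, vectors, alpha=0.5):
    """Blend each score with a similarity-weighted average of the other scores.

    Assumes non-negative term weights (e.g. TF-IDF), so similarities are >= 0.
    alpha is a hypothetical mixing parameter; alpha=0 ignores the correlation.
    """
    blended = []
    for i, s_i in enumerate(scores):
        weights = [cosine(vectors[i], vectors[j]) if j != i else 0.0
                   for j in range(len(scores))]
        total = sum(weights)
        if total > 0:
            neighbour = sum(w * s for w, s in zip(weights, scores)) / total
        else:
            neighbour = s_i  # no similar documents: keep the original score
        blended.append((1 - alpha) * s_i + alpha * neighbour)
    return blended


def top_k_log_prob(scores, ranking, k):
    """Plackett-Luce log-probability that ranking[:k] fills the top k in order.

    Costs O(k * n) per query instead of summing over all n! permutations.
    """
    logp = 0.0
    for pos in range(k):
        denom = sum(math.exp(scores[d]) for d in ranking[pos:])
        logp += scores[ranking[pos]] - math.log(denom)
    return logp


def two_stage_top_k(first_scores, rescore, k, big_k):
    """Keep big_k >= k candidates by the first-layer score, then re-rank them."""
    order = sorted(range(len(first_scores)),
                   key=lambda d: first_scores[d], reverse=True)
    candidates = order[:big_k]
    return sorted(candidates, key=rescore, reverse=True)[:k]
```

In this sketch, calling `two_stage_top_k` with `big_k` equal to `k` can discard a relevant document before the second layer ever sees it, while a somewhat larger `big_k` lets the second layer recover it. That is the effect the abstract attributes to enlarging K in the first layer.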

Keywords/Search Tags:

machine learning, learning to rank, vector space model, similarity between documents, NDCG

Related items

1	Research On Information Retrieval Model Based On Learning To Rank
2	Research On Learning To Rank For Information Retrieval
3	Study Of XML Documents Clustering In Web Mining Domain
4	Learning to rank documents with support vector machines via active learning
5	Research On Machine Learning Algorithm With Environmental Data Prediction
6	A Research And Optimization Of Learning To Rank Based Personalized Recommendation Algorithms
7	Research On Machine Learning Algorithm For Recommendation System
8	Rank Optimization For Person Re-identification Through Intelligent Machine Learning Techniques
9	Research On Some Problems Of Support Vector Machine Learning Algorithm
10	Research On Top Rank Learning And Its Applications