The Research On Learning To Rank Algorithm Based On Topic Similarity

Posted on:2017-03-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y Liu

Full Text:PDF

GTID:2308330485471016

Subject:Computer Science and Technology

Abstract/Summary:

The emergence of search engines has greatly improved the efficiency of access to information. However, how to move the information that users care about and need most to the front of a massive set of search results remains one of the core issues in search engine research. Optimizing the position of the pages at the top of the search results has considerable research and commercial value.

Learning to rank is a method that uses machine learning algorithms to solve the document ranking problem. It is useful for document retrieval, collaborative filtering, and many other applications. Building on existing Listwise methods of learning to rank, this thesis proposes a new method that uses the text similarity between documents to improve the scoring function of the original algorithm and thereby further improve the ranking performance of the model. The major contributions of this thesis include the following three aspects:

1) This thesis proposes a new method for the Listwise approach. Specifically, it introduces text similarity between documents as a new metric, which extends the scoring function from query-document scoring alone to one in which documents also vote for each other according to their similarity. The new metric takes full advantage of the inherent correlation between documents and of the characteristics of their text. It considers the search ranking problem from a more general and comprehensive perspective and ultimately produces more reasonable ranking results.

2) This thesis proposes a new model that combines the VSM and LDA models to measure the similarity between texts at both the word level and the topic level. The combination of the two models compensates for the shortcomings of each and improves the calculated similarity.

3) The experimental results show that the ListSimi algorithm outperforms the existing algorithms on the OHSUMED and TD2003 data sets.

ListSimi can significantly improve the accuracy of existing learning to rank algorithms, especially at the front of the document list. Returning correct top pages is crucial for a commercial search engine, because the quality of the top pages it returns directly affects the user's search experience and satisfaction.
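The abstract does not give the formulas behind ListSimi, so the following Python is only a minimal sketch of the two ideas it describes: a document similarity that mixes a word-level (VSM) score with a topic-level (LDA) score, and a scoring function in which documents vote for each other according to that similarity. Everything specific here is an assumption for illustration and not the thesis's actual method: the function names, the linear mixing weight `lam`, the voting weight `alpha`, the use of cosine similarity at both levels, and the premise that each document's LDA topic distribution has already been inferred elsewhere.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def vsm_similarity(tokens_a, tokens_b):
    """Word-level similarity: cosine over raw term-frequency vectors."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    vocab = sorted(set(tf_a) | set(tf_b))
    return cosine([tf_a[w] for w in vocab], [tf_b[w] for w in vocab])


def combined_similarity(tokens_a, tokens_b, topics_a, topics_b, lam=0.5):
    """Assumed linear mix: lam * word-level + (1 - lam) * topic-level.

    topics_a and topics_b stand in for precomputed LDA topic distributions.
    """
    word_sim = vsm_similarity(tokens_a, tokens_b)
    topic_sim = cosine(topics_a, topics_b)
    return lam * word_sim + (1 - lam) * topic_sim


def rescore(base_scores, sim, alpha=0.3):
    """Blend each base query-document score with a similarity-weighted vote.

    base_scores[i] is the score from an existing ranking model, and
    sim[i][j] is the similarity between documents i and j (hypothetical input).
    """
    n = len(base_scores)
    new_scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = sum(sim[i][j] for j in others)
        if total:
            vote = sum(sim[i][j] * base_scores[j] for j in others) / total
        else:
            vote = base_scores[i]  # no similar neighbours: keep the base score
        new_scores.append((1 - alpha) * base_scores[i] + alpha * vote)
    return new_scores
```

With `alpha=0` the sketch reduces to the base ranker, which makes it easy to compare a baseline against the similarity-adjusted scores on the same candidate list.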

Keywords/Search Tags:

Information retrieval, Learning to rank, Topic model, Text similarity

Related items

1	Study On Learning To Rank And Query Reformulation Based Information Retrieval Model
2	Researches On Information Retrieval Model Based On The Algorithm Of Learning To Rank
3	Improved Text Topic Representation And Learning Method
4	Research On Information Retrieval Model Based On Learning To Rank
5	Research On Short Text Topic Information Mining Technology
6	Text Retrieval Based On Real-time Twitter Streaming
7	A Short Texts Matching Method Using Multi-level Features
8	The Research Of Combine Learning To Rank And Topic Model
9	The Research And Implementation Of Text Similarity Computing Based On Topic Model
10	Research On Bilingual Topic Model And Its Algorithm In Cross-language Information Retrieval