
Research On Web Search Ranking Based On Semi-supervised Learning

Posted on: 2020-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Li
Full Text: PDF
GTID: 2428330590474470
Subject: Software engineering
Abstract/Summary:
In recent years, search engines have come to play a very important role in daily life, and people rely on them to retrieve a wide variety of information. Web search ranking is a sub-module of the search engine. It orders the large number of related web pages the engine returns so that the pages most relevant to the query are ranked first, letting users click the top results and get satisfactory answers. Optimizing web search ranking is therefore especially important, because it reduces the time users spend looking for useful information. This paper studies the web search ranking task, aiming to improve ranking results on three metrics (NDCG, ERR, and Q-measure) given a limited number of labeled query-document pairs and a large number of unlabeled query-document pairs. It carries out the following three studies:

(1) Learning to Rank is a statistical learning method. Because it can combine multiple document features and capture the semantic relationships of documents, it is widely used in web search ranking. This paper manually extracts feature vectors to represent documents, compares different feature combinations to isolate the influence of click features, and then compares ranking models including Ranking SVM, RankNet, and LambdaMART. The click features prove very helpful for improving the ranking result, and LambdaMART achieves the best overall performance among the models tested.

(2) Different document representations have some influence on the ranking result. This paper explores different query-document representation methods and compares the performance of the resulting ranking models. It finds that representing documents with deep learning methods works better than manually extracted features, because deep learning can extract implicit information from the documents and capture their semantic relationships. In addition, the attention-based hierarchical LSTM model gives better results: it assigns different weights to different words and sentences, and by taking the importance of individual words into account it better captures the semantic relationships between words and sentences.

(3) For this task, labeled data is scarce and costly to produce, whereas unlabeled data is easy to obtain. This paper therefore applies semi-supervised learning to web search ranking. Comparative experiments with different semi-supervised learning methods show that semi-supervised learning helps improve the ranking result. Co-training in particular is stable and performs well.
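For readers unfamiliar with two of the metrics named above, the following is a minimal Python sketch of NDCG and ERR as they are commonly defined in the learning-to-rank literature. It is not taken from the thesis: the function names, the exponential gain 2^g - 1, and the log2 discount are assumptions, and the thesis may use different variants or cutoffs. Q-measure is omitted.

```python
import math

def dcg(grades, k=None):
    """Discounted cumulative gain of relevance grades listed in ranked order."""
    grades = grades[:k] if k is not None else grades
    # Assumed gain 2^g - 1 and discount log2(rank + 1), with rank starting at 1.
    return sum((2 ** g - 1) / math.log2(rank + 1)
               for rank, g in enumerate(grades, start=1))

def ndcg(grades, k=None):
    """DCG normalized by the DCG of the ideal (descending-grade) ordering."""
    ideal = dcg(sorted(grades, reverse=True), k)
    return dcg(grades, k) / ideal if ideal > 0 else 0.0

def err(grades, max_grade, k=None):
    """Expected reciprocal rank under the cascade user model."""
    grades = grades[:k] if k is not None else grades
    score, p_continue = 0.0, 1.0
    for rank, g in enumerate(grades, start=1):
        # Probability that the user is satisfied by the document at this rank.
        p_stop = (2 ** g - 1) / (2 ** max_grade)
        score += p_continue * p_stop / rank
        p_continue *= 1 - p_stop
    return score
```

Each function takes the graded relevance labels of one query's results in the order the ranker returned them, so comparing two rankers means averaging these scores over the same set of queries.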
Keywords/Search Tags: Web Ranking, Learning to Rank, Document Representations, Semi-supervised Learning