
Algorithm Research on Learning to Rank for Web Pages

Posted on: 2014-11-10 | Degree: Master | Type: Thesis
Country: China | Candidate: H Deng | Full Text: PDF
GTID: 2268330422464568 | Subject: Computational Mathematics
Abstract/Summary:
Advances in information technology have brought us into a vast digital age, and the influx of data has made search engines increasingly important. Locating the needed information quickly within huge amounts of data is critical. A search engine consists of several components; the web page ranking component determines the order of the search results and directly affects both the engine's performance and the user experience, so it is the core problem in search engine design. Information retrieval offers many web page ranking algorithms, which can be roughly classified into the point-wise, pair-wise and list-wise approaches. Researchers have contributed a wide variety of algorithms to all three approaches, and learning to rank for the web remains a very active research area.
To address the web learning-to-rank problem, we first built ranking models based on Support Vector Machine (SVM) theory under the point-wise and the pair-wise approach respectively. The parameters of the SVM models were chosen by cross-validation, and the choice of kernel function was analyzed. Preprocessing consisted of normalization and a visualization analysis of the fractal feature data. For the pair-wise model, the training samples were generated with a random sequence matching method. Second, using a heuristic method, we built a learning-to-rank model in which a genetic algorithm optimizes a BP neural network. The model uses the search ability of the genetic algorithm to obtain good initial weights and thresholds for the BP network and thereby improve its performance. Principal component analysis was used to compress the training data while preserving the fidelity of the high-dimensional data, and to keep the BP network structure at an appropriate size.
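As an illustration of how pair-wise training samples can be derived from graded relevance labels, here is a minimal sketch. It assumes that the "random sequence matching" step amounts to pairing documents of the same query that have different labels and choosing the order within each pair at random; the abstract does not spell this out, so this reading is an assumption. The function name `make_pairs` and the toy data are hypothetical.

```python
import random


def make_pairs(features, labels, rng):
    """Build pair-wise samples for the documents of ONE query.

    features: list of feature vectors (lists of floats), one per document
    labels:   list of graded relevance labels, one per document
    rng:      a random.Random instance used to pick the order inside a pair

    Returns a list of (difference_vector, target) tuples, where target is
    +1 if the first document of the pair is more relevant, otherwise -1.
    """
    samples = []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                continue  # equally relevant documents carry no preference
            # Randomize the order so both classes (+1 and -1) appear.
            a, b = (i, j) if rng.random() < 0.5 else (j, i)
            diff = [x - y for x, y in zip(features[a], features[b])]
            target = 1 if labels[a] > labels[b] else -1
            samples.append((diff, target))
    return samples


# Toy query with three documents (hypothetical values).
toy_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
toy_labels = [2, 0, 2]
toy_samples = make_pairs(toy_features, toy_labels, random.Random(0))
```

The resulting difference vectors and their ±1 targets can then be fed to any binary classifier, such as an SVM, which is the usual way a pair-wise ranking problem is reduced to classification.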
Third, we designed a boosting-based page ranking model to study how a strong ranker can be built up from learners with only weak learning ability. Experiments on all three models were conducted on the OHSUMED data set, followed by result analysis and a comparison of the algorithms. The experimental results show that the pair-wise model performed slightly better than the point-wise model; that using the genetic algorithm to optimize the weights and thresholds of the BP network improves ranking precision, but at a high time cost; and that the boosting method also achieves good results, again at a high time cost.
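The idea of building a strong ranker from weak ones can be sketched with a RankBoost-style loop. This is not necessarily the boosting model used in the thesis, which the abstract does not describe in detail. The sketch assumes that each weak ranker is a single feature scaled to [0, 1] and that preferences are given as (better, worse) index pairs; the names `rankboost` and `score` and the toy data are hypothetical.

```python
import math


def rankboost(features, pairs, rounds=5):
    """RankBoost-style ensemble over single-feature weak rankers.

    features: list of feature vectors with values scaled to [0, 1]
    pairs:    list of (better, worse) document index pairs
    Returns a list of (feature_index, alpha) tuples.
    """
    n_features = len(features[0])
    weights = [1.0 / len(pairs)] * len(pairs)  # distribution over pairs
    ensemble = []
    for _ in range(rounds):
        # Pick the feature whose weighted pair margin r has the largest magnitude.
        best_k, best_r = 0, 0.0
        for k in range(n_features):
            r = sum(w * (features[b][k] - features[c][k])
                    for w, (b, c) in zip(weights, pairs))
            if abs(r) > abs(best_r):
                best_k, best_r = k, r
        if best_r == 0.0:
            break  # no weak ranker carries any signal
        # Clamp r away from +/-1 so the logarithm stays finite.
        r = max(min(best_r, 1 - 1e-9), -1 + 1e-9)
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        ensemble.append((best_k, alpha))
        # Down-weight pairs the chosen ranker orders correctly, then renormalize.
        weights = [w * math.exp(-alpha * (features[b][best_k] - features[c][best_k]))
                   for w, (b, c) in zip(weights, pairs)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def score(ensemble, x):
    """Final ranking score: weighted sum of the selected features."""
    return sum(alpha * x[k] for k, alpha in ensemble)


# Toy data: document 0 is most relevant, document 2 least (hypothetical values).
toy_docs = [[0.9, 0.2], [0.5, 0.8], [0.1, 0.5]]
toy_pairs = [(0, 1), (0, 2), (1, 2)]
toy_ensemble = rankboost(toy_docs, toy_pairs, rounds=5)
toy_scores = [score(toy_ensemble, d) for d in toy_docs]
```

Every round scans all features over all pairs, so training time grows with the number of rounds, features and pairs, which is consistent with the time cost the abstract reports for boosting.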
Keywords/Search Tags: Learning to rank, SVM, Genetic algorithm, BP network, Boosting, Cross-validation