The Research Based On The Search Engine Page Rank Algorithm

Posted on: 2012-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Li
Full Text: PDF
GTID: 2178330335499744
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer technology, search engine technology has emerged. People hope to find the most relevant and authoritative web pages through a search engine, and whether they can do so depends directly on the quality of the engine's page ranking algorithm. The page ranking algorithm has therefore become one of the technologies that determine whether a commercial search engine has core competitiveness.
This thesis studies the development and operating principles of search engines, analyzes the significant influence of the page ranking algorithm on search engine performance, and examines in depth the classical PageRank and HITS algorithms together with related improved algorithms proposed worldwide. By absorbing the relevant ideas from these algorithms and analyzing their shortcomings, we put forward a new improved algorithm, SPR.
Existing page ranking algorithms are mainly based on analysis of the link structure of web pages; the most representative are PageRank and HITS. A ranking algorithm based only on Web links addresses only the authority of a page, which can easily lead to topic drift in query results. This thesis addresses the topic drift problem through the relevance of page content. In addition, PageRank's practice of distributing a page's weight equally among the pages it links to is unreasonable, so this thesis proposes distributing weight according to page popularity instead. Improving the classical PageRank algorithm in both respects, link structure and page content, yields the new SPR algorithm.
The SPR model addresses the authority problem through link structure and transmits page weight more reasonably than the classical PageRank algorithm. It also addresses topical relevance through page content, which weakens the topic drift phenomenon.
Finally, we build a search engine simulator that incorporates the classical PageRank algorithm and the improved SPR algorithm separately, and obtain search results from each. We then design an evaluation standard and use it to evaluate the experimental results of the two methods; the results support the improved SPR algorithm.
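The difference between equal and popularity-based weight distribution can be illustrated with a small sketch. The abstract does not give the SPR formula, so the code below is not SPR: it implements classical PageRank by power iteration and, as an assumption for illustration only, an alternative split in which each page passes its weight to its link targets in proportion to a popularity score. The function names `pagerank` and `in_degree`, the use of in-link count as the popularity measure, the damping factor 0.85, and the toy graph are all hypothetical choices; the content-relevance component of SPR is not modeled.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=100, popularity=None):
    """Power-iteration PageRank.

    links: dict mapping each page to the list of pages it links to.
    popularity: optional dict mapping page -> positive score. When None,
        a page's weight is split equally among its targets (classical
        PageRank). Otherwise it is split in proportion to the targets'
        popularity (an illustrative assumption, not the thesis's SPR).
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives the random-jump share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if not targets:
                # Dangling page: spread its weight uniformly over all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
                continue
            if popularity is None:
                weights = {q: 1.0 for q in targets}
            else:
                weights = {q: popularity[q] for q in targets}
            total = sum(weights.values())
            for q, w in weights.items():
                new_rank[q] += damping * rank[p] * w / total
        rank = new_rank
    return rank


def in_degree(links):
    """Count in-links per page; used here as a stand-in for popularity."""
    counts = defaultdict(int)
    for targets in links.values():
        for q in set(targets):
            counts[q] += 1
    return dict(counts)


# Hypothetical four-page graph.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C", "B"]}

classical = pagerank(links)
weighted = pagerank(links, popularity=in_degree(links))

for page in sorted(classical):
    print(page, round(classical[page], 4), round(weighted[page], 4))
```

In this toy graph, page C has more in-links than page B, so under the popularity-proportional split pages A and D pass a larger share of their weight to C than they would under the equal split. Any real comparison would need the thesis's own definitions of popularity and content relevance.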
Keywords/Search Tags: search engine, page rank, HITS, PageRank, SPR algorithm