On The Study Of Ranking Algorithms In Relation To Search Engine

Posted on: 2012-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: K Chen
Full Text: PDF
GTID: 2178330335452292
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the network has become an important source of information. However, the rapid growth of online information makes it harder to find what one needs quickly and accurately. Search engines were developed to solve this problem. A search engine combines a series of technologies, including web crawling, segmentation, page indexing, data storage, retrieval, and results ranking, of which results ranking is one of the most important. What users care about most is how well the search results match the topic of their query and whether the results are helpful to them. Therefore, a major concern of current search engine development is whether the most relevant result pages can be placed first.

After reviewing the development history of search engines, the author studies their overall framework, analyzes the main function of each part of the search engine system and the entire workflow, and defines evaluation criteria for search engines. On that basis, the paper focuses on search engine ranking algorithms. It analyzes the fundamental principles and implementation methods of two kinds of ranking algorithms: those based on the content of web pages and those based on page link analysis. The paper concentrates on the latter, studying the ideas, basic process, merits and demerits, and modifications of three classical ranking algorithms: PageRank, HITS and HillTop.

To overcome the shortcomings of ranking based on link analysis, this paper introduces the concept of user feedback, analyzes how it can be used in search engines, and proposes an improved method based on the PageRank algorithm. The improved PageRank algorithm adds weights for user clicks and click-time feedback, together with a page-content weight borrowed from content-based ranking, to the formula for calculating the PR value.

Through experiments that verify the improved algorithm and that analyze and compare the result pages, the author shows that the improved PageRank algorithm can address problems such as topic drift, web spoofing, and bias toward old pages, and that it puts the most helpful result pages first, improving the quality of search results.
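
The abstract does not state the improved formula, so the following is only an illustrative sketch of the general idea: a PageRank iteration in which each page's outgoing rank is split among its link targets in proportion to a per-page weight rather than equally. The function name `weighted_pagerank`, the `feedback` and `relevance` dictionaries, and the way the two signals are combined (a product with +1 smoothing) are all assumptions made for this example, not the thesis's actual method.

```python
def weighted_pagerank(links, feedback, relevance, damping=0.85, iters=50):
    """Illustrative weighted PageRank (not the thesis's formula).

    links:     dict mapping each page to a list of pages it links to
    feedback:  hypothetical per-page user-feedback weight (e.g. from clicks)
    relevance: hypothetical per-page content-relevance weight
    """
    pages = sorted(links)
    n = len(pages)
    # Assumed combination of the two signals; +1 keeps every weight positive.
    weight = {
        p: (1.0 + feedback.get(p, 0.0)) * (1.0 + relevance.get(p, 0.0))
        for p in pages
    }
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iters):
        # Random-jump term shared equally by all pages.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = [t for t in links[src] if t in weight]
            if not targets:
                # Dangling page: spread its rank uniformly over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[src] / n
                continue
            total = sum(weight[t] for t in targets)
            for t in targets:
                # Each target receives a share proportional to its weight.
                new_rank[t] += damping * rank[src] * weight[t] / total
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    plain = weighted_pagerank(graph, feedback={}, relevance={})
    boosted = weighted_pagerank(graph, feedback={"b": 2.0}, relevance={})
    print("plain:  ", plain)
    print("boosted:", boosted)
```

With empty `feedback` and `relevance`, all weights are equal and the iteration reduces to standard PageRank; giving page `b` a feedback weight shifts more of page `a`'s outgoing rank toward `b`.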
Keywords/Search Tags: search engine, ranking algorithm, PageRank, HITS, HillTop