Font Size: a A A

The Study Of Web Search Algorithms

Posted on:2008-08-01Degree:MasterType:Thesis
Country:ChinaCandidate:L ZhangFull Text:PDF
GTID:2178360212492500Subject:Operational Research and Cybernetics
Abstract/Summary:PDF Full Text Request
PageRank has been widely used to measure the importance of web pages based on their interconnections in the web graph. Mathematically speaking, PageRank can be explained using a Markov random walk model, in which the direct out-links of a page contribute to the transition probability of this page. Finally we use the stationary distribution as the measurement.I propose improving the PageRank algorithm by looking N-step forward when constructing the transition probability matrix, The motivation comes from the similar "looking N-step forward" strategy that is successfully used in computer chess. In particular, we assume that if the random surfer knows the N-step out-links of each web page, he/she can make a better decision on choosing which page to navigate for the next time. It is clear that the classical PageRank algorithm is a special case of our proposed N-step PageRank method. Experimental results on the dataset of TREC Web track show that our proposed algorithm can boost the search accuracy of classical PageRank by more than 15% in terms of mean average precision.Both of the classic PageRank algorithm and N-step PageRank algorithm use the stationary distribution as the measurement of the web pages' importance, but it costs a lot of time to compute the stationary distribution. In this paper I proposed Weightedlndegree Algorithm which is another simple link analysis algorithm. Different from PageRank Algorithm, it ranks the web pages with their weighted indegrees directly. Experimental results on the dataset of TREC Web track show that the new algorithm can boost the search accuracy of classical PageRank by more than 15% in terms of mean average precision, as the same time reduce the complexity greatly.
Keywords/Search Tags:Web Search, Link Analysis, PageRank algorithm, N-step PageRank Algorithm, WeightedIndegree Algorithm
PDF Full Text Request
Related items