
Research on the PageRank Webpage Ranking Algorithm Based on Web Content and Time Feedback

Posted on: 2013-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Li
Full Text: PDF
GTID: 2248330395477153
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, search engines have become one of the most important ways for people to obtain information. However, search engines face a major challenge: the volume of webpage information is enormous, and most users look only at the first few pages of results. It is therefore very important for a search engine to retrieve valuable information from the Internet quickly and accurately, which makes the webpage ranking algorithm one of the key problems in the field.

Taking search engine webpage ranking as its research background, this thesis analyzes existing webpage ranking algorithms and compares the advantages and disadvantages of several classical ones, including HITS, which is based on link analysis, PageRank, and improved variants of PageRank.

An improved PageRank algorithm based on webpage content and time feedback is proposed. The algorithm recognizes that words carry different semantic weight depending on where they appear in a document and on their length. It extends the traditional TF-IDF function with position and length weights to obtain an overall word weight, then uses the vector space model to calculate the similarity weight of the shared feature items between linked webpages. It also introduces a time feedback factor, in which the number of cycles represents how long a webpage has existed. These two improvements together determine how much PageRank value is transferred between webpages.

A simulation experiment system was built to verify the function and efficiency of the improved algorithm. A series of experiments was carried out and analyzed in this system, and the results show that the proposed PageRank algorithm is effective. A statistical comparison of word weights shows that the improved algorithm expresses the importance of a word in a webpage more accurately than the traditional algorithm. The actual search results show that the improved algorithm achieves a better accuracy rate and recall rate than traditional link-analysis algorithms such as HITS and PageRank.
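
The abstract does not give the thesis's formulas, so the following Python sketch only illustrates the general idea under stated assumptions: TF-IDF terms are scaled by a position weight and a length weight, linked pages are compared with cosine similarity in the vector space model, and the PageRank value a page passes along each outgoing link is proportional to that similarity multiplied by a time factor. The position weights, the logarithmic length weight, the form and direction of `time_factor`, and all function names are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

# Hypothetical position weights; the thesis's actual values are not given.
POSITION_WEIGHT = {"title": 3.0, "heading": 2.0, "body": 1.0}


def length_weight(term):
    # Assumed form: longer terms receive a mild logarithmic boost.
    return 1.0 + math.log(len(term))


def weighted_tfidf(docs):
    """docs maps page -> list of (term, position); returns page -> {term: weight}."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update({term for term, _ in tokens})
    vectors = {}
    for page, tokens in docs.items():
        tf = Counter()
        for term, position in tokens:
            tf[term] += POSITION_WEIGHT[position]  # position-weighted count
        total = sum(tf.values()) or 1.0
        vectors[page] = {
            term: (count / total)
            * (math.log((1 + n) / (1 + df[term])) + 1.0)  # smoothed IDF
            * length_weight(term)
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def time_factor(cycles):
    # Assumption: pages that have existed for more cycles are damped slightly.
    # The thesis may use a different form or the opposite direction.
    return 1.0 / (1.0 + math.log1p(cycles))


def content_time_pagerank(links, vectors, cycles, d=0.85, iters=50):
    """links maps page -> list of linked pages; every target must be a key of links."""
    pages = list(links)
    n = len(pages)
    # Share of a page's rank sent along each outgoing link.
    share = {}
    for p, outs in links.items():
        raw = {q: cosine(vectors[p], vectors[q]) * time_factor(cycles[q]) for q in outs}
        total = sum(raw.values())
        if total > 0:
            share[p] = {q: w / total for q, w in raw.items()}
        else:  # no content overlap: fall back to the uniform split of plain PageRank
            share[p] = {q: 1.0 / len(outs) for q in outs}
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        dangling = sum(rank[p] for p in pages if not share[p])
        new = {p: (1.0 - d) / n + d * dangling / n for p in pages}
        for p in pages:
            for q, w in share[p].items():
                new[q] += d * rank[p] * w
        rank = new
    return rank


# Toy data: page "a" links to a topically similar page "b" and an unrelated page "c".
docs = {
    "a": [("pagerank", "title"), ("ranking", "body"), ("search", "body")],
    "b": [("pagerank", "title"), ("ranking", "body")],
    "c": [("cooking", "title"), ("recipe", "body")],
}
links = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
cycles = {"a": 5, "b": 5, "c": 5}
ranks = content_time_pagerank(links, weighted_tfidf(docs), cycles)
```

In this toy graph, "a" shares no terms with "c", so all of the rank that "a" passes on goes to "b". Plain PageRank would instead split it evenly between the two links.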
Keywords/Search Tags: Webpage ranking, PageRank algorithm, Vector space model, Time feedback