
Research on the Algorithms of Web Structure Mining

Posted on: 2013-01-01 | Degree: Master | Type: Thesis
Country: China | Candidate: X B Fu | Full Text: PDF
GTID: 2248330371966713 | Subject: Communication and Information System
Abstract/Summary:
With the development of information technology, users can conveniently acquire all kinds of information. At the same time, they face the problem of how to find relevant and useful information on the Web. Web data mining can address this problem. There are two classical algorithms: PageRank and Hyperlink-Induced Topic Search (HITS).

PageRank considers only the link relationships among web pages and ignores the pages themselves. Yet the same page has a different value depending on the site it belongs to. For example, people pay more attention to a page on the Ministry of Education's site than to one on BUPT's site, so such pages should be endowed with more authority. The HITS algorithm brings in many irrelevant pages when the base set is expanded from the root set. These irrelevant pages reduce the algorithm's efficiency and effectiveness.

This paper introduces Web mining, analyzes it in theory, and improves both the PageRank algorithm and the HITS algorithm. The research is supported by experiments. The main work of this paper is as follows:

1. Describe the PageRank and HITS algorithms and point out their defects.
2. Improve the quality of the PageRank algorithm: when evaluating a page's weight, the authority of its site is included as one factor of the evaluation.
3. Improve the efficiency of the HITS algorithm: only the frequent pages are kept when the root set is expanded into the base set.
4. Verify the improved algorithms by experiments on real data.
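
Items 2 and 3 above can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration. The first function is a personalized PageRank whose teleport vector is proportional to a per-site authority score, which is one possible way to fold site authority into a page's weight. The second function treats a page as "frequent" when it is connected to at least `min_links` root pages. The names `site_weighted_pagerank`, `expand_base_set`, `site_of`, `site_authority`, and `min_links` are hypothetical and do not come from the thesis.

```python
from collections import Counter


def site_weighted_pagerank(links, site_of, site_authority,
                           damping=0.85, iterations=50):
    """Personalized PageRank with a site-authority teleport vector.

    links: dict mapping a page to the list of pages it links to.
    site_of: dict mapping a page to its site (assumed input).
    site_authority: dict mapping a site to a positive weight (assumed input).
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    # Pages on unknown sites get a neutral weight of 1.0.
    raw = {p: site_authority.get(site_of.get(p), 1.0) for p in pages}
    total = sum(raw.values())
    teleport = {p: raw[p] / total for p in pages}

    rank = dict(teleport)
    for _ in range(iterations):
        # Random-jump share, biased toward high-authority sites.
        new = {p: (1 - damping) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page: spread its rank along the teleport vector.
                for q in pages:
                    new[q] += damping * rank[p] * teleport[q]
        rank = new
    return rank


def expand_base_set(root_set, links, min_links=2):
    """Expand a HITS root set, keeping only well-connected neighbours.

    A non-root page is kept only if it is connected to at least
    `min_links` root pages (an assumed definition of "frequent").
    """
    root = set(root_set)
    counts = Counter()
    for p, outs in links.items():
        for q in set(outs):
            if p in root and q not in root:
                counts[q] += 1  # q is linked from a root page
            elif q in root and p not in root:
                counts[p] += 1  # p links to a root page
    return root | {p for p, c in counts.items() if c >= min_links}
```

In this sketch, two pages with identical link structure receive different scores when their sites carry different authority, and a neighbour that touches only one root page is dropped from the base set before the HITS iteration starts. The threshold `min_links` and the authority weights would need tuning against real data.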
Keywords/Search Tags: Web structure mining, PageRank, HITS algorithm, site authority