Research on a New Algorithm for Web Structure Mining

Posted on: 2011-04-19  Degree: Master  Type: Thesis
Country: China  Candidate: W F Liu  Full Text: PDF
GTID: 2178330332488374  Subject: Computer software and theory
Abstract/Summary:
Web data mining combines data mining technology with Internet applications and has become a focus of data mining research. Web structure mining is an important branch of Web data mining; its classic algorithms are HITS and PageRank. Although both algorithms have achieved some success, they also have shortcomings, such as topic drift. Building on an in-depth study and analysis of HITS and PageRank, and targeting some of their weaknesses, this thesis proposes a new algorithm, ANWSMA, that combines hyperlink sets, hyperlink weights, and time weights. The algorithm first builds a directed graph using the structure-based set-construction idea of HITS. It then replaces the damping factor of PageRank with a time weight, assigns each page a hyperlink weight according to the page's importance, and computes rank values by which the pages are sorted. Finally, the rationality and usability of the algorithm are verified through simulation experiments and comparison with the classical algorithms.
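The abstract does not give the ANWSMA formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the thesis's actual algorithm. It assumes the directed graph has already been built (the HITS-style set construction is not shown), that the time weight is a per-page value in (0, 1) used where PageRank would use its damping factor, and that each page's rank is shared among its out-links in proportion to their hyperlink weights. The function name `weighted_time_rank` and all parameter names are hypothetical.

```python
def weighted_time_rank(links, link_weight, time_weight, iterations=50):
    """Illustrative PageRank-style iteration with per-page time weights.

    links:       dict mapping a page to the list of pages it links to
    link_weight: dict mapping (source, target) to a positive weight;
                 missing entries default to 1.0 (assumption)
    time_weight: dict mapping a page to a value in (0, 1) used in place
                 of the damping factor; missing entries default to 0.85
                 (assumption)
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    # Total outgoing hyperlink weight of each page, used for normalisation.
    out_total = {
        p: sum(link_weight.get((p, q), 1.0) for q in links.get(p, []))
        for p in pages
    }

    for _ in range(iterations):
        new_rank = {}
        for p in pages:
            d = time_weight.get(p, 0.85)
            incoming = 0.0
            for src in pages:
                if p in links.get(src, []) and out_total[src] > 0:
                    share = link_weight.get((src, p), 1.0) / out_total[src]
                    incoming += rank[src] * share
            new_rank[p] = (1.0 - d) / n + d * incoming
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
    weights = {("A", "B"): 3.0, ("A", "C"): 1.0}
    ranks = weighted_time_rank(graph, weights, {})
    for page, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
        print(page, round(value, 4))
```

With uniform hyperlink weights and a constant time weight, this sketch reduces to the ordinary PageRank iteration (without special handling of pages that have no out-links). How the thesis actually derives time weights and hyperlink weights is not stated in the abstract.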
Keywords/Search Tags: Web Structure Mining, PageRank, HITS, Time Weight, ANWSMA