Research On HITS Algorithm Of Web Structure Mining

Posted on: 2009-06-01  Degree: Master  Type: Thesis
Country: China  Candidate: J Liu  Full Text: PDF
GTID: 2178360245483025  Subject: Computer application technology
Abstract/Summary:
The Web is an enormous information repository that provides many kinds of information services. With the spread of the Web and the rapid growth of Web information, acquiring the information we need from the Web has become increasingly important. Discovering valuable information in the distributed Web environment and extracting knowledge from it has therefore become an important task in information research and data mining. Users hope to retrieve not only relevant Web pages but also high-quality ones, that is, authoritative pages. A page's hyperlinks are an important means of identifying such pages, and the introduction and application of hyperlink analysis provide a wholly new approach to solving these problems. The HITS (Hypertext-Induced Topic Search) algorithm is a widely used algorithm for distilling authoritative sources; it is based on hyperlink analysis and is of high research value.

In Web structure mining, hyperlink analysis has been used successfully to analyze the hyperlink data of Web pages and extract authoritative information sources. Among the various hyperlink analysis methods, the HITS algorithm is the most widely used. This thesis discusses the HITS algorithm and analyzes its topic drift problem. The root-set eigenvector projection method and the base-set downsizing method are then implemented to improve the algorithm. Building on the projection method, a weighted root-set eigenvector projection method and a weighted base-set eigenvector projection method are also proposed to improve it further, so that authoritative Web pages can be found more effectively.
Keywords/Search Tags: Web data mining, HITS algorithm, Eigenvector projection method, Downsizing method, Weighted eigenvector projection method
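
For readers unfamiliar with HITS, the following is a minimal sketch of the basic hub/authority iteration as it is commonly described, not the thesis's own implementation. It does not include the projection or downsizing improvements named in the abstract. The function names `hits` and `normalize`, the dictionary-based graph format, and the fixed iteration count are assumptions made for illustration only.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Basic HITS iteration on a small link graph.

    links: dict mapping each page to the list of pages it links to
           (a hypothetical input format chosen for this sketch).
    Returns (hub, authority) dicts of normalized scores.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub score of a page: sum of authority scores of the pages it links to.
        new_hub = {p: sum(new_auth[d] for d in links.get(p, [])) for p in pages}

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth
```

In a toy graph where pages "a" and "b" both link to "c", page "c" ends up with all of the authority weight, while "a" and "b" share the hub weight equally.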