Study on the Improvement of the HITS Algorithm for Web Structure Mining

Posted on: 2009-03-16  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Yang  Full Text: PDF
GTID: 2178360242483920  Subject: Applied Mathematics
Abstract/Summary:
The rapid and disorganized growth of the World Wide Web makes Web information retrieval more and more difficult. Traditional search engines were once powerful tools for retrieving information, but they can no longer keep pace with the development of the WWW. The thesis first analyzes the challenges facing Web information retrieval and the advantages and disadvantages of traditional search engines. It then introduces several text-analysis-based methods used in traditional search engines. Hyperlink-analysis-based methods, such as HITS and PageRank, extract knowledge from the hyperlink structure of the Web rather than from the text of Web pages. The thesis describes the development of hyperlink-analysis-based methods, with emphasis on HITS and on the work that analyzes and improves it. HITS is an important topic-based link analysis method, but it treats all links without distinction, which easily leads to topic drift. Based on an analysis of HITS, the thesis proposes an improved HITS algorithm built on topic relevance and website popularity, using relevance and popularity to distinguish the importance of links. Experimental results show that the improved HITS algorithm finds more relevant websites than the original HITS, ARC, and SALSA algorithms, increasing the proportion of relevant websites by 30%-50%, thereby greatly reducing topic drift and improving the efficiency and quality of queries.
Keywords/Search Tags: HITS algorithm, link analysis, relevance, popularity
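
For readers unfamiliar with HITS, the following is a minimal sketch of the standard hub/authority iteration, extended with optional per-link weights to illustrate the general idea of treating links unequally. The abstract does not give the thesis's actual relevance or popularity formulas, so the `weights` parameter, the function names, and the example graph are all assumptions made for illustration, not the author's method.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, weights=None, iterations=50):
    """Run the HITS iteration on a small link graph.

    links:   dict mapping a page to the list of pages it links to.
    weights: hypothetical dict mapping (source, target) to a non-negative
             link weight; missing links default to 1.0. With weights=None
             this is plain HITS, where every link counts equally.
    Returns (hub, authority) score dicts.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)

    def w(src, dst):
        return 1.0 if weights is None else weights.get((src, dst), 1.0)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the (weighted) hub scores of in-linking pages.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += w(src, dst) * hub[src]
        # Hub update: sum the (weighted) authority scores of linked pages.
        new_hub = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_hub[src] += w(src, dst) * new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


if __name__ == "__main__":
    graph = {"a": ["c"], "b": ["c", "d"], "x": ["d"]}
    # Plain HITS: every link counts the same.
    print(hits(graph)[1])
    # Assumed weighting: the link x -> d is judged off-topic and down-weighted.
    print(hits(graph, weights={("x", "d"): 0.1})[1])
```

In the unweighted run, pages `c` and `d` receive links symmetrically and are treated alike; once the link judged off-topic is down-weighted, `d` loses authority relative to `c`. This is the kind of effect a relevance-based weighting aims for when reducing topic drift.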