
Topic-Relative Web Site Ranking Based On Combination Of Contents And Links

Posted on: 2009-02-01  Degree: Master  Type: Thesis
Country: China  Candidate: Z B Zheng  Full Text: PDF
GTID: 2178360242982975  Subject: Computer application technology
Abstract/Summary:
The rapid growth of the Internet poses challenges of unprecedented scale for general-purpose search engines. The results they return are too broad: thousands of irrelevant results do not meet users' precise search needs. Vertical search engines, which serve a single field, emerged in response. In recent years the website, one of the most important organizational structures of the Web, has also played an increasingly important role in web search and mining applications. Topic-relative ranking of websites has therefore become an essential technique in many web applications, such as vertical web search and focused crawlers.

Link topology is widely used to identify important pages. This thesis goes further: when a page's rank value is transferred along its links, the contents of the linked pages are also taken into account. The thesis introduces a topic-relative random surfer model and then elaborates on a topic-relative PageRank based on a combination of contents and links. It can be shown that this algorithm effectively prevents the "theme drift" phenomenon.

Since the number of web pages is much larger than the number of websites, it is not feasible to calculate the topic-relative rank of a website by summing the topic-relative PageRank values of its pages. To tackle this problem, we propose a novel method, topic-relative AggregateRank, which ranks websites based on a combination of contents and links. It not only approximates the sum of PageRank values accurately but also has a lower computational complexity than PageRankSum.

Finally, the system design is introduced, including the system structure and method. An analysis of fifty websites on the child health topic fetched by the system shows that the proposed method is both effective and efficient.
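The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: rank flowing out of a page is split among its out-links in proportion to a topical relevance score of each linked page, and the random jump is biased toward on-topic pages. The function names, the `relevance` scores, and the damping value are all hypothetical and not taken from the thesis. The second function shows the page-level summation baseline (PageRankSum) that the abstract describes as too costly at scale; the AggregateRank approximation itself is not reproduced here.

```python
def topic_relative_pagerank(links, relevance, damping=0.85, iterations=50):
    """Content-weighted PageRank sketch (assumed formulation, not the thesis's).

    links:     dict mapping a page to the list of pages it links to
    relevance: dict mapping a page to a topical relevance score >= 0
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    total_rel = sum(relevance.get(p, 0.0) for p in pages)

    # Jump distribution biased toward on-topic pages (uniform if no page is relevant).
    if total_rel > 0:
        jump = {p: relevance.get(p, 0.0) / total_rel for p in pages}
    else:
        jump = {p: 1.0 / len(pages) for p in pages}

    rank = dict(jump)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * jump[p] for p in pages}
        for p in pages:
            targets = links.get(p, [])
            weight = sum(relevance.get(q, 0.0) for q in targets)
            if weight > 0:
                # Transfer rank in proportion to the relevance of each linked page.
                for q in targets:
                    new_rank[q] += damping * rank[p] * relevance.get(q, 0.0) / weight
            else:
                # No out-links, or none on topic: redistribute via the jump distribution.
                for q in pages:
                    new_rank[q] += damping * rank[p] * jump[q]
        rank = new_rank
    return rank


def site_rank_by_sum(page_rank, site_of):
    """Baseline site score: sum of the page scores belonging to each site."""
    totals = {}
    for page, score in page_rank.items():
        site = site_of(page)
        totals[site] = totals.get(site, 0.0) + score
    return totals
```

Because off-topic targets receive a smaller share of each transfer, rank leaks less into unrelated pages, which is one plausible reading of how content weighting limits theme drift.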
Keywords/Search Tags:Topic-Relative, Websites Ranking, Content Analysis