
The Research On Intelligent Web Information Retrieval

Posted on: 2005-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: W Han
Full Text: PDF
GTID: 2168360122992313
Subject: Computer software and theory
Abstract/Summary:
As the WWW grows, Web information retrieval systems with higher performance are required, and research on Web information retrieval has become a focus of attention. Recently, focused crawling systems have been proposed to serve people who need specialized knowledge from the WWW.

This dissertation introduces the key aspects of a focused crawling system and then discusses in depth the classification problem that arises in focused crawling. At present, most Web page classification methods use only the contents of a page and ignore the links between pages completely. In fact, the links between Web pages sometimes reflect the topics of the linked pages. This dissertation therefore designs a new Web page classification method that uses both the links and the contents of a page to decide the page's class. Experimental results show an improvement over methods that consider page contents only. The dissertation then designs an improved focused crawling system that uses a classifier based on both the contents and the links of a Web page to decide the page's class; experimental results show an improvement over the common method.

To verify these methods, a focused crawling system was developed using Visual C++ 6.0.
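
The abstract does not say how the link and content evidence are combined, so the following is only a minimal illustrative sketch in Python (not the Visual C++ 6.0 system described above, and not the author's actual method). It assumes a hypothetical scheme: a page's content score is the fraction of its words that belong to a topic vocabulary, and the final score is a weighted average of the page's own content score and the mean content score of the pages it links to. The names `content_score`, `combined_score`, `is_on_topic`, the weight `alpha`, and the `threshold` are all assumptions introduced for illustration.

```python
def content_score(text, topic_terms):
    """Fraction of whitespace-separated tokens in `text` that are topic terms."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if tok in topic_terms) / len(tokens)


def combined_score(page, pages, links, topic_terms, alpha=0.5):
    """Blend a page's own content score with the mean score of its linked pages.

    `pages` maps page id -> text, `links` maps page id -> list of linked page ids.
    `alpha` is a hypothetical weight: 1.0 means contents only, 0.0 means links only.
    A page without known linked pages falls back to its content score alone.
    """
    own = content_score(pages[page], topic_terms)
    neighbours = [n for n in links.get(page, []) if n in pages]
    if not neighbours:
        return own
    linked = sum(content_score(pages[n], topic_terms) for n in neighbours) / len(neighbours)
    return alpha * own + (1 - alpha) * linked


def is_on_topic(page, pages, links, topic_terms, alpha=0.5, threshold=0.3):
    """Classify a page as on-topic when its combined score reaches `threshold`."""
    return combined_score(page, pages, links, topic_terms, alpha) >= threshold


if __name__ == "__main__":
    topic_terms = {"crawler", "search", "index", "retrieval"}
    pages = {
        "a": "focused crawler for web search",
        "b": "our home page",
        "c": "football match report",
        "d": "search index retrieval crawler",
    }
    links = {"a": ["d"], "b": ["a", "d"], "c": [], "d": ["a"]}
    for page in sorted(pages):
        print(page, is_on_topic(page, pages, links, topic_terms))
```

In this toy data, page `b` contains no topic words itself but links to two on-topic pages, so a contents-only classifier (`alpha=1.0`) rejects it while the blended score accepts it. This is the kind of case where link information can change the decision; whether it helps in practice depends on the data and on how the weight and threshold are chosen.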
Keywords/Search Tags: Internet, World Wide Web, Information Retrieval System, Search Engine, Classification Algorithm