Design And Implementation Of Network Spider Based On Topic

Posted on: 2015-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Zheng
Full Text: PDF
GTID: 2278330461999687
Subject: Software engineering
Abstract/Summary:
As web information changes constantly, it is becoming increasingly difficult for a search engine to provide users with a high-quality, comprehensive, and timely information search service. The basic limitation is that a general search engine attempts to index all web information and to serve queries on every topic. In contrast, a topic-based search engine covers only web information related to a specific topic, so its content can be deeper and its update cycle shorter. It can also meet the demand for fast and accurate access to information resources. At present, the topic-based web search engine is becoming an active area of research and development in computer science and the information industry.

This paper first describes the current state of search engine development and then, based on a study of the general-purpose spider, designs the overall architecture of the topic-based spider and each of its modules. Three chapters describe the analysis, design, and implementation of the three major modules: the website downloading module, the page preprocessing module, and the topic filter module. Finally, the paper discusses future work on the topic-based spider and the technologies that need further study...
Keywords/Search Tags:Topic-based search engine, Web spider, Correlation calculation
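
The three modules named in the abstract (downloading, page preprocessing, topic filtering) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not state which correlation calculation is used, so cosine similarity over raw term counts is assumed here. The names `preprocess`, `relevance`, `crawl`, the `fetch` callback, and the `threshold` value are all hypothetical. The download step is injected as a function so that the sketch runs without network access, and links are treated as opaque keys (no relative-URL resolution).

```python
import math
import re
from collections import Counter, deque
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and outgoing links from an HTML page."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.links = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def preprocess(html):
    """Page preprocessing: strip markup, return (tokens, links)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    tokens = re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())
    return tokens, parser.links


def relevance(tokens, topic_terms):
    """Topic filter score (assumed): cosine similarity of term counts."""
    page = Counter(tokens)
    topic = Counter(topic_terms)
    dot = sum(page[term] * weight for term, weight in topic.items())
    norm = math.sqrt(sum(c * c for c in page.values())) * math.sqrt(
        sum(w * w for w in topic.values())
    )
    return dot / norm if norm else 0.0


def crawl(seeds, fetch, topic_terms, threshold=0.2, max_pages=50):
    """Breadth-first crawl that only follows links from on-topic pages.

    fetch(url) is a caller-supplied download function returning HTML or None.
    """
    frontier = deque(seeds)
    seen = set(seeds)
    kept = []
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        html = fetch(url)  # downloading module (injected)
        fetched += 1
        if html is None:
            continue
        tokens, links = preprocess(html)
        score = relevance(tokens, topic_terms)
        if score < threshold:
            continue  # off-topic: drop the page and do not follow its links
        kept.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return kept
```

In real use, `fetch` would wrap an HTTP client and the topic terms would come from a curated vocabulary. Pruning links from off-topic pages is one simple design choice; a production topic spider might instead score links individually before enqueueing them.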