Design And Implementation Of University Topic Crawler Based On BP Network

Posted on: 2010-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: B Han
Full Text: PDF
GTID: 2178360275488978
Subject: Computer software and theory
Abstract/Summary:
The Web contains a wealth of useful resources, and search engines have become an important tool for retrieving them. However, as the volume of information has grown, the number of web pages has exceeded 2 billion. Traditional search engines return too many results with low topic relevance, and they struggle to meet users' ever-increasing demand for personalized service. This poses an unprecedented challenge for general-purpose search engines. In response, topic search engines aimed at specific domains and specific user groups have emerged.

The topic crawler is the foundation and core of a topic search engine. It is built on top of a general crawler and extends its functionality. This thesis studies the techniques relevant to topic crawlers and builds a university topic crawler intended to find more resources related to colleges and universities.

The BP (back-propagation) artificial neural network is a multi-layer network whose weights are trained using non-linear, differentiable transfer functions. It embodies much of the essence of neural network theory. Because of its simple structure and plasticity, it has been widely used in pattern recognition, information classification, and data compression. Its clear mathematical meaning and well-defined learning steps have further broadened its range of applications.

This thesis mainly describes the design and implementation of the university topic crawler. A key problem is how to judge the topic relevance of web pages. The thesis applies BP technology, which handles non-linear problems well, to the university topic crawler in order to predict topic relevance and thereby guide the crawler toward information relevant to university resources. The resulting classification is more precise and reasonable than linear classification and has good fault tolerance.

The experimental results show that the university topic crawler performs well and has high practical value. It achieves higher precision than a crawler that judges topic relevance with the vector space model.
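To make the idea of a BP relevance predictor concrete, here is a minimal sketch of a one-hidden-layer network trained by backpropagation. It is not the thesis's implementation: the class name `BPNetwork`, the two binary page features, the hidden-layer size, the learning rate, and the XOR-style toy labels are all assumptions chosen only to show a relevance rule that no linear classifier could separate.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BPNetwork:
    """Hypothetical one-hidden-layer network trained by backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each hidden row holds n_in weights followed by a bias term.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        # Output weights: one per hidden unit, followed by a bias term.
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden))
                      + self.w_out[-1])
        return hidden, out

    def predict(self, x):
        """Return a relevance score in (0, 1) for feature vector x."""
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        hidden, out = self.forward(x)
        # Output delta for squared error with a sigmoid output unit.
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas: push the output delta back through w_out.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x):
                row[i] -= lr * delta_hidden[j] * v
            row[-1] -= lr * delta_hidden[j]


def mean_loss(net, samples):
    return sum(0.5 * (net.predict(x) - t) ** 2
               for x, t in samples) / len(samples)


# Toy data (assumed): two binary page features, label 1.0 = on-topic.
SAMPLES = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

net = BPNetwork(n_in=2, n_hidden=4, seed=0)
loss_before = mean_loss(net, SAMPLES)
for _ in range(5000):
    for x, t in SAMPLES:
        net.train_step(x, t, lr=0.5)
loss_after = mean_loss(net, SAMPLES)
print(loss_before, loss_after)
```

In a crawler, a score from `predict` could be compared against a threshold to decide whether a page or link is worth following. The features a real system would feed in (for example, term weights from page text or anchor text) are not specified in the abstract.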
Keywords/Search Tags: Search engine, Relevance, Personalization, Topic crawler, Artificial neural network