Design And Implementation Of Focused Crawler Based On Ontology

Posted on: 2009-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z Yang
GTID: 2178360245471567
Subject: Management Science and Engineering
Abstract/Summary:
The Web has greatly improved access to information. As the Internet grows, however, the gap between the volume of Web information and people's ability to retrieve it keeps widening. Traditional search engines cannot keep up with the increasingly demanding and varied search requirements of different users. Focused search engines have recently been proposed in response: their results are better classified, their data is deeper and more topic-specific, their hardware requirements are lower, and they are updated in a more timely manner. The focused crawler is a main component of a focused search engine. It can be described as a crawler that, while traversing the Web, returns pages relevant to a given topic. The search algorithm of the focused crawler is the critical technology of a topic-specific search engine, because it determines the engine's performance, such as its coverage of network information resources and the breadth of its search. This thesis studies and discusses that search algorithm.

First, the thesis briefly introduces the basic theory of focused search engines and analyzes the working principle of the focused crawler. It then discusses several kinds of focused-crawler search algorithms. The Best-first search algorithm, which is often used in focused crawler systems, is described in detail, and the tunnel problem in this algorithm is analyzed. Based on this discussion, the thesis proposes an ontology-based search algorithm: when the crawler encounters a topic-irrelevant page, it does not give up crawling immediately but uses an ontology of the related domain to probe for a direction and pass through the tunnel. Finally, the new algorithm is used to design and implement a focused crawler prototype system. Experimental results show that the algorithm can effectively expand the search scope and improve the accuracy of the search results.
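The tunnel idea described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the toy web graph, the term-overlap scoring, the ontology weights, and the names `TOY_WEB`, `ONTOLOGY`, `topic_score`, `ontology_score`, `crawl`, `threshold`, and `max_tunnel` are all assumptions made for illustration. The baseline here is also simplified so that a best-first crawler drops every topic-irrelevant page, while the ontology variant keeps following links for a bounded number of hops when the page mentions a related concept.

```python
import heapq

# Hypothetical toy web: url -> (page text, outgoing links). Not from the thesis.
TOY_WEB = {
    "seed": ("focused crawler overview", ["hub", "offtopic"]),
    "hub": ("search engine index pages", ["target"]),
    "offtopic": ("cooking recipes", ["hidden"]),
    "target": ("focused crawler tunnel experiments", []),
    "hidden": ("focused crawler notes", []),
}

TOPIC_TERMS = {"focused", "crawler"}
# Hypothetical ontology: related concept -> weight in [0, 1].
ONTOLOGY = {"search": 0.6, "engine": 0.6, "index": 0.4}


def topic_score(text):
    """Fraction of topic terms that appear in the page text."""
    words = set(text.split())
    return len(words & TOPIC_TERMS) / len(TOPIC_TERMS)


def ontology_score(text):
    """Highest weight of any ontology concept found in the page text."""
    words = set(text.split())
    return max((ONTOLOGY[w] for w in words if w in ONTOLOGY), default=0.0)


def crawl(seed, web, use_ontology=True, threshold=0.5, max_tunnel=2):
    """Best-first crawl; optionally tunnel through ontology-related pages."""
    # Frontier entries: (negated priority, url, consecutive irrelevant hops).
    frontier = [(-1.0, seed, 0)]
    seen = {seed}
    relevant = []
    while frontier:
        _, url, depth = heapq.heappop(frontier)
        text, links = web[url]
        score = topic_score(text)
        if score >= threshold:
            relevant.append(url)
            priority, next_depth = score, 0
        else:
            onto = ontology_score(text) if use_ontology else 0.0
            if onto == 0.0 or depth >= max_tunnel:
                continue  # no related concept or tunnel too long: abandon path
            # Keep going, but with a reduced priority while inside the tunnel.
            priority, next_depth = onto * 0.5, depth + 1
        for link in links:
            if link not in seen and link in web:
                seen.add(link)
                heapq.heappush(frontier, (-priority, link, next_depth))
    return relevant


print(crawl("seed", TOY_WEB, use_ontology=False))
print(crawl("seed", TOY_WEB, use_ontology=True))
```

In this toy graph, the page `target` sits behind the topic-irrelevant page `hub`. Without the ontology the crawler stops at `hub`; with it, `hub` is recognized as domain-related and its links are still followed. The page `hidden` stays unreached in both runs because `offtopic` matches no ontology concept, which shows that the tunnel is selective rather than a blanket relaxation.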
Keywords/Search Tags: focused crawler, tunnel, Ontology, Best-first Search algorithm