Design And Implementation Of A Focused Search Engine

Posted on: 2005-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: B Yao
GTID: 2168360125952977
Subject: Computer application technology
Abstract/Summary:
As more information becomes available on the World Wide Web, it becomes more difficult to provide effective search tools for information retrieval. Because the Web is so large and changes so quickly, universal search engines can crawl and index only a portion of it, and so can no longer provide a comprehensive, up-to-date search service. In contrast to universal search engines, which attempt to index the whole Web, focused search engines can cover specialized topics in more depth and keep their crawls fresher. Focused search engines use rich context and a good crawling strategy to guide the navigation of links, with the goal of efficiently locating highly relevant target pages. The design and implementation of focused search engines is going through a highly creative phase, and a great deal of machine learning work is being applied to the task.

In this thesis, we survey the state of the art in focused search engines and study the related techniques for building one. Based on this study, we discuss a number of focused crawling algorithms and strategies that are representative of the dominant varieties published in the literature. We also describe in detail the design and implementation of a focused search engine, MySE V1.0. MySE is a system-oriented effort to integrate a suite of techniques into an information retrieval tool. With this tool, we will test the value of our own and others' ideas on focused crawling strategies.
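
To illustrate how a crawling strategy can guide link navigation toward relevant pages, a best-first focused crawl can be sketched as below. This is only a minimal sketch and not the algorithm used in MySE, which the abstract does not specify. The in-memory TOY_WEB, the term-overlap relevance score, the rule that a link inherits the score of the page citing it, and all function names are assumptions made for illustration.

```python
import heapq


def relevance(text, topic_terms):
    """Toy relevance score: fraction of words in `text` that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in topic_terms for w in words) / len(words)


def focused_crawl(pages, seeds, topic_terms, max_pages=10):
    """Best-first crawl over an in-memory web.

    `pages` maps url -> (text, outlinks). Returns (url, score) pairs in
    the order the pages were visited.
    """
    # heapq is a min-heap, so priorities are negated to pop the best first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue  # dangling link: nothing to fetch
        text, outlinks = pages[url]
        score = relevance(text, topic_terms)
        visited.append((url, score))
        for link in outlinks:
            if link not in seen:
                seen.add(link)
                # Assumption: a link is as promising as the page citing it.
                heapq.heappush(frontier, (-score, link))
    return visited


# Hypothetical five-page web: one on-topic branch, one off-topic branch.
TOY_WEB = {
    "seed": ("web search portal", ["a", "b"]),
    "a": ("focused crawler search engine", ["c"]),
    "b": ("cooking recipes pasta", ["d"]),
    "c": ("focused search engine crawler design", []),
    "d": ("more pasta recipes", []),
}
TOPIC = {"focused", "crawler", "search", "engine"}

order = [url for url, _ in focused_crawl(TOY_WEB, ["seed"], TOPIC, max_pages=4)]
print(order)  # ['seed', 'a', 'c', 'b']
```

With a budget of four pages, the crawl follows the on-topic branch ("a", then "c") before touching the off-topic page "b", and never reaches "d". A breadth-first crawl with the same budget would have visited "b" before "c".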
Keywords/Search Tags: Web Information Retrieval, Focused Search Engines, Focused Crawling Strategy, Machine Learning, Design Pattern