
The Design And Implementation Of Vertical Search Engine Framework

Posted on: 2012-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: J Xu
Full Text: PDF
GTID: 2248330395955393
Subject: Computer application technology
Abstract/Summary:
The World Wide Web is growing exponentially, and its dynamic, unstructured nature makes it difficult to locate useful resources. Web search engines such as Google and Baidu return a huge amount of information, much of which may not be relevant to the user's query. General-purpose search engines therefore can no longer satisfy the needs of most users searching for specific information on a given topic.

A vertical search engine searches a specific industry, topic, type of content (e.g., travel, movies, images, blogs, and live events), piece of data, geographical location, and so on. Some of this content cannot be found, or is difficult to find, on general search engines. For this reason, the topic of vertical search is closely related to that of the deep Web.

After discussing the history of search engine development and analyzing the characteristics of vertical search engines, together with the function, principle, and technology strategy of their various parts, this paper presents a new vertical search engine framework based on a model of the target data. Topic seed sites are pre-selected, and rules are specified for judging topic URLs and for extracting data from pages; this solves the problem of relevance between the topic and the pages. At the same time, it greatly improves the efficiency of crawling and updating pages from the web, because non-related pages are no longer downloaded. Clearly, choosing related web sites as the crawling target cannot capture all information about the topic, so the framework provides a simple strategy for adding new web sites, which improves coverage of the target resources. Finally, this paper implements the framework in a vertical search engine system for science and technology projects and intellectual property.
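
The rule-driven approach described in the abstract (pre-selected seed sites, a rule for judging topic URLs, and rules for extracting data from pages) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the `SiteRule` structure, its field names, the use of regular expressions as the rule language, and the `example.org` URLs and sample HTML are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urljoin


@dataclass
class SiteRule:
    """Hypothetical per-site configuration; all field names are illustrative."""
    seed_url: str            # pre-selected topic seed page
    topic_url_pattern: str   # regex an absolute URL must fully match to be on-topic
    field_patterns: dict = field(default_factory=dict)  # field name -> regex with one group


HREF_RE = re.compile(r'href="([^"]+)"')


def topic_links(rule, page_url, html):
    """Return absolute links on the page that match the site's topic URL rule."""
    pattern = re.compile(rule.topic_url_pattern)
    links = []
    for href in HREF_RE.findall(html):
        absolute = urljoin(page_url, href)
        # Off-topic URLs are dropped here, so they are never downloaded.
        if pattern.fullmatch(absolute) and absolute not in links:
            links.append(absolute)
    return links


def extract_record(rule, html):
    """Apply each extraction rule; fields whose pattern does not match are omitted."""
    record = {}
    for name, pattern in rule.field_patterns.items():
        match = re.search(pattern, html, re.S)
        if match:
            record[name] = match.group(1).strip()
    return record


# In this sketch, adding a new web site only means appending another SiteRule.
RULES = [
    SiteRule(
        seed_url="http://example.org/projects/",
        topic_url_pattern=r"http://example\.org/projects/\d+\.html",
        field_patterns={
            "title": r"<h1>(.*?)</h1>",
            "owner": r'<span class="owner">(.*?)</span>',
        },
    ),
]

# Made-up sample pages standing in for downloaded HTML.
listing = (
    '<a href="123.html">P</a> '
    '<a href="/about.html">About</a> '
    '<a href="/projects/456.html">Q</a>'
)
detail = '<h1> Solar Cell Project </h1><span class="owner">Lab A</span>'

print(topic_links(RULES[0], RULES[0].seed_url, listing))
print(extract_record(RULES[0], detail))
```

Keeping the URL rule and the extraction rules together in one per-site record is one plausible way to make adding a site a configuration change rather than a code change; a real system would likely use an HTML parser instead of regular expressions.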
Keywords/Search Tags: Vertical Search Engine, Web Spider, Web Structure, Full-text Indexing