
Research and Application of a Focused Crawler for Vertical Search Engines

Posted on: 2009-06-04  Degree: Master  Type: Thesis
Country: China  Candidate: H Lv  Full Text: PDF
GTID: 2178360242973836  Subject: Software engineering
Abstract/Summary:
With the development of search engines, vertical search applications that serve specific domains have begun to emerge. Vertical search emphasizes specialization and structural analysis, and it is built on structured data related to the subject domain. How to acquire that structured data accurately and promptly has therefore become a major research topic in vertical search.

A crawler is the information gatherer of a search engine: it automatically extracts the hyperlinks on pages and downloads information from the web. For acquiring structured data, however, a general-purpose crawler cannot meet the needs of a vertical search engine. This thesis therefore proposes a focused crawler oriented to vertical search engines to solve this problem.

After a brief introduction to the technical background of vertical search and crawlers, this thesis centers on the focused crawler for vertical search engines and completes the following main research and application work:

1) It systematically describes the concept of a focused crawler for vertical search engines, its main working principle and workflow, and its key technologies, and it discusses development trends.
2) For the two most basic stages of a focused crawler, page capture and information extraction, it draws on two established open-source projects from abroad, the Heritrix crawler and the Web-Harvest tool. This also lays the technical groundwork for the application that follows.
3) Building on existing research, it introduces a real vertical search engine project for job advertisements. Based on the application requirements of a concrete site (the "ZhiLian" website), it designs and implements a focused crawler system that solves the project's structured data acquisition problem.

The system is extensible and modifiable and has practical application value. The main contribution of this thesis lies in making reasonable use of several open-source projects, drawing on the strengths of each, to give a practical solution for a focused crawler oriented to vertical search engines.
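
The two stages named in the abstract, page capture and information extraction, can be illustrated with a minimal sketch. This is not the thesis's system and does not use Heritrix or Web-Harvest; it is a standard-library Python illustration under stated assumptions. The names `focused_crawl`, `LinkExtractor`, `FieldExtractor`, the `jobs.example` pages, the `/job/` URL rule, and the `title` and `company` class names are all hypothetical, and pages are served from an in-memory dictionary rather than the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Page capture helper: collects the absolute URL of every <a href>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


class FieldExtractor(HTMLParser):
    """Information extraction helper: reads the text of elements whose
    class attribute equals one of the wanted field names (an assumption
    about page layout, made only for this sketch)."""

    def __init__(self, fields):
        super().__init__()
        self.fields = set(fields)
        self.record = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.fields:
            self._current = cls

    def handle_data(self, data):
        if self._current and data.strip():
            self.record[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def focused_crawl(seed, fetch, is_relevant, fields, max_pages=100):
    """Breadth-first crawl that only follows links accepted by is_relevant
    and returns one structured record per page that contains the fields."""
    queue = deque([seed])
    seen = {seed}
    records = []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        extractor = FieldExtractor(fields)
        extractor.feed(html)
        if extractor.record:
            records.append({"url": url, **extractor.record})
        link_parser = LinkExtractor(url)
        link_parser.feed(html)
        for link in link_parser.links:
            if link not in seen and is_relevant(link):
                seen.add(link)
                queue.append(link)
    return records


# Hypothetical in-memory site standing in for a job-advertisement website.
PAGES = {
    "http://jobs.example/": '<a href="/job/1">Job 1</a>'
                            '<a href="/about">About</a>'
                            '<a href="/job/2">Job 2</a>',
    "http://jobs.example/job/1": '<h1 class="title">Java Engineer</h1>'
                                 '<span class="company">Acme</span>',
    "http://jobs.example/job/2": '<h1 class="title">Tester</h1>'
                                 '<span class="company">Globex</span>'
                                 '<a href="/job/1">Back</a>',
    "http://jobs.example/about": "<p>About us</p>",
}


def is_relevant(url):
    # Topic filter: keep only links that look like job postings.
    return "/job/" in url


records = focused_crawl("http://jobs.example/", PAGES.get, is_relevant,
                        ["title", "company"])
for record in records:
    print(record)
```

The design point the sketch tries to show is the difference from a general-purpose crawler: the relevance filter keeps off-topic links out of the queue, and each fetched page is reduced to a structured record rather than stored as raw text.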
Keywords/Search Tags: Vertical Search Engine, Focused Crawler, structured data acquisition, Heritrix, Web-Harvest