The Research And Realization Of Tour Guide Information Vertical Search System

Posted on: 2010-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2178360278466364
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of information on the internet, a traditional general-purpose search engine returns many irrelevant results because it collects web pages without regard to their topic. Vertical search engine technology was developed so that users can find information in a particular domain more precisely and more quickly. A vertical search engine collects only pages from designated sources that are related to a specific topic.

Nutch is a recently developed open-source web search engine that uses Lucene for its indexing and search modules. Its workflow comprises web page collection, preprocessing, and search. It follows the same workflow as a commercial search engine, and its functional modules are independent of one another, so a vertical search engine can be built quickly by reworking the relevant modules.

This thesis aims to build a tour guide search and management website based on vertical search engine technology. The most critical issues are the Chinese word segmentation algorithm and topic relevance. The thesis uses Nutch as the basic search engine and modifies two of its important modules: the page collection module and the Chinese parsing module.

On this basis, the thesis first discusses the differences between general and vertical search engines, then the core technologies of general search engines, followed by a deeper study of vertical search engines and the principles of Nutch. It uses the shark-search algorithm for web page collection, followed by a topic relevance judgment. For Chinese word segmentation, it uses a new dictionary mechanism and a new segmentation algorithm.

The thesis then discusses in detail the overall design of the tour guide information search engine, including the website's functions, the architecture, the database design, and the implementation of all the algorithms and functions.

It closes with a summary and some suggestions for future work.
Keywords/Search Tags: vertical search engine, Nutch, Chinese word segmentation, topic relevance, spider
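
The abstract names the shark-search algorithm for page collection but gives no detail. As a rough illustration only, the sketch below follows the commonly published form of shark-search link scoring: a child link inherits a decayed score from its parent page and combines it with a score from its anchor text and surrounding context. It is not the thesis's implementation. The constants `DELTA`, `BETA`, and `GAMMA`, the function names, and the whitespace-token cosine `similarity` are all assumptions made for the example; a real system for Chinese pages would tokenize with a word segmenter instead.

```python
import math
from collections import Counter

# Hypothetical tuning constants; the thesis does not state its values.
DELTA = 0.5  # decay applied to scores inherited from the parent page
BETA = 0.8   # weight of anchor text versus anchor context
GAMMA = 0.5  # weight of inherited score versus neighborhood score


def similarity(query: str, text: str) -> float:
    """Cosine similarity over whitespace tokens (a stand-in relevance measure)."""
    q = Counter(query.lower().split())
    t = Counter(text.lower().split())
    if not q or not t:
        return 0.0
    dot = sum(q[w] * t[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in t.values())
    )
    return dot / norm


def inherited_score(query: str, parent_text: str, parent_inherited: float) -> float:
    """A relevant parent passes on its own relevance; otherwise its inherited score decays."""
    relevance = similarity(query, parent_text)
    if relevance > 0:
        return DELTA * relevance
    return DELTA * parent_inherited


def neighborhood_score(query: str, anchor_text: str, anchor_context: str) -> float:
    """Combine anchor-text relevance with the relevance of the text around the link."""
    anchor = similarity(query, anchor_text)
    # If the anchor itself matches, the context is treated as fully relevant.
    context = 1.0 if anchor > 0 else similarity(query, anchor_context)
    return BETA * anchor + (1 - BETA) * context


def potential_score(
    query: str,
    parent_text: str,
    parent_inherited: float,
    anchor_text: str,
    anchor_context: str,
) -> float:
    """Priority used to order a child URL in the crawl frontier."""
    return GAMMA * inherited_score(query, parent_text, parent_inherited) + (
        1 - GAMMA
    ) * neighborhood_score(query, anchor_text, anchor_context)
```

In a crawler built this way, each extracted link would be pushed onto a priority queue keyed by `potential_score`, so links whose anchors or parent pages mention the target topic are fetched before unrelated ones.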