
Research And Implementation Of Vertical Search Engine

Posted on: 2013-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Bai
Full Text: PDF
GTID: 2248330371964530
Subject: Computer software and theory
Abstract/Summary:
With the rapid growth of information on the Internet, general-purpose search engines can no longer satisfy users whose queries are specialized, sophisticated, and deep, and vertical search engines have emerged in response. A vertical search engine provides specialized retrieval for a particular field or profession; it refines and extends the general search engine. Compared with a general search engine, a vertical search engine searches more deeply, updates its information more quickly, and returns more accurate information.

This thesis is organized around the key technologies of the vertical search engine and covers the following aspects:

(1) Overview of vertical search engines. First, this part introduces the different kinds of vertical search engine, classified both by form of implementation and by granularity of search results, and puts forward evaluation metrics. Then the three most important techniques are introduced: the focused crawler, full-text indexing, and keyword-based search. Finally, supporting technologies such as Chinese word segmentation and web page purification are described.

(2) Study and implementation of a focused crawler. The main goal of a focused crawler is to capture resources related to a specified topic as efficiently as possible. After a study of existing focused crawlers, a focused crawler based on a probabilistic model is proposed, which resolves the “topic drift” and “tunnel” problems and also ensures the quality of the crawled web pages. Finally, comparative experiments demonstrate the feasibility of the focused crawler based on the probabilistic model.

(3) Research on query expansion. Query expansion can remedy the shortcomings of keyword-matching queries and improve both recall and precision. This thesis presents a local query expansion method based on Tongyici Cilin, which overcomes the poor performance of simple local expansion in some cases. Finally, a comparative experiment on three methods demonstrates the effectiveness of the local query expansion method based on Tongyici Cilin.

(4) Realization of a vertical search engine prototype system. First, this part introduces Lucene, a full-text retrieval toolkit, including its system structure, its indexing and search mechanisms, and its scoring mechanism. Then a Lucene-based vertical search engine system called VSE is put forward and its architecture is discussed. Finally, the main functional modules of VSE are explained with reference to the system's code.
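
The abstract does not spell out how the thesaurus and local expansion are combined, so the following is only a minimal sketch of one plausible reading: candidate terms are taken from the top-ranked documents of an initial search (local expansion), and candidates that a thesaurus lists as synonyms of a query term receive a higher weight. The names `SYNONYM_GROUPS`, `synonyms`, `expand_query`, and the `synonym_boost` parameter are all hypothetical, and the tiny English synonym table merely stands in for Tongyici Cilin; none of this is taken from the thesis itself.

```python
from collections import Counter

# Hypothetical stand-in for a thesaurus such as Tongyici Cilin:
# each set groups words that are treated as synonyms.
SYNONYM_GROUPS = [
    {"car", "automobile", "vehicle"},
    {"price", "cost"},
]


def synonyms(term):
    """Return the thesaurus synonyms of `term`, excluding the term itself."""
    result = set()
    for group in SYNONYM_GROUPS:
        if term in group:
            result |= group - {term}
    return result


def expand_query(query_terms, top_docs, max_new_terms=2, synonym_boost=2.0):
    """Append expansion terms drawn from the top-ranked documents.

    Candidates are scored by how often they occur in `top_docs`; the score
    is multiplied by `synonym_boost` (an assumed weighting, not the thesis's)
    when the thesaurus lists the candidate as a synonym of a query term.
    """
    query_set = set(query_terms)
    related = set()
    for term in query_terms:
        related |= synonyms(term)

    # Count candidate terms in the locally retrieved documents.
    counts = Counter()
    for doc in top_docs:
        for token in doc.lower().split():
            if token not in query_set:
                counts[token] += 1

    scores = {
        token: count * (synonym_boost if token in related else 1.0)
        for token, count in counts.items()
    }
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query_terms) + ranked[:max_new_terms]
```

For example, calling `expand_query(["car", "price"], ["automobile cost report", "used automobile dealer", "cost of repair"])` appends "automobile" and "cost" to the query, because both occur in the feedback documents and are thesaurus synonyms of the original terms. A real system would tokenize Chinese text with a word segmenter rather than `split()`.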
Keywords: vertical search, focused crawler, probabilistic model, query expansion, Tongyici Cilin, Lucene