
Design And Implementation Of A Vertical Search System

Posted on: 2013-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: H Lei
Full Text: PDF
GTID: 2248330362963668
Subject: Software engineering
Abstract/Summary:
Traditional general-purpose search engines let people search a massive, horizontal body of information. Their advantage is comprehensive coverage, but because the scope of the information is so wide, accuracy is hard to guarantee. When a user needs information about a particular domain or industry, a general search engine often cannot meet the requirement well. In that case a domain-oriented vertical search engine can be used: it targets a specific area or industry, processes the information in depth, and provides users with more accurate results.

This thesis studies and analyzes the key technologies of vertical search engines, then designs and implements a vertical search system for tablet computers based on users' needs when searching for tablets. It first analyzes the core technologies of a vertical search engine, such as the theme crawler, information extraction, and full-text retrieval, with particular attention to the inverted index and Lucene, an open-source full-text retrieval toolkit. It then focuses on Chinese word segmentation, another key technology, covering segmentation methods and algorithms. Using the string-matching approach, the thesis first builds a basic main dictionary for the tablet computer domain, then applies a maximum matching algorithm based on prefix words, and finally designs and implements a Chinese word segmentation module for the tablet domain that implements the Lucene analyzer interface. Compared with other word segmentation systems, the module implemented here is more accurate in the tablet domain.

Building on these theories and technologies, the thesis presents the overall system design, including the functional module division, the architecture, the technologies used, and the development environment. Finally, it discusses the detailed design and implementation of the system, using UML analysis and design techniques and the J2EE three-layer architecture to present the whole process. A comparison between this system and traditional search engines such as Baidu and Soso, searching for information in the tablet domain, indicates that the system implemented in this thesis is much more accurate.
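
The dictionary-based segmentation step mentioned in the abstract can be illustrated with a minimal Python sketch of forward maximum matching over a dictionary indexed by first character. This is an assumption-laden illustration, not the thesis's implementation: the abstract does not give the dictionary contents, the exact form of the "prefix word" index, or the matching direction, so the word list, `build_prefix_index`, and `segment` below are all hypothetical.

```python
from collections import defaultdict


def build_prefix_index(words):
    """Group dictionary words by their first character (hypothetical prefix index)."""
    index = defaultdict(set)
    for word in words:
        if word:
            index[word[0]].add(word)
    return index


def segment(text, index):
    """Forward maximum matching: at each position, emit the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Only words sharing the current first character can match here.
        candidates = index.get(text[i], ())
        best = text[i]  # fall back to a single character when nothing matches
        for word in candidates:
            if len(word) > len(best) and text.startswith(word, i):
                best = word
        tokens.append(best)
        i += len(best)
    return tokens


# Hypothetical mini-dictionary for the tablet domain (not taken from the thesis).
DICTIONARY = ["平板", "平板电脑", "电脑", "屏幕", "分辨率"]

if __name__ == "__main__":
    index = build_prefix_index(DICTIONARY)
    print(segment("平板电脑屏幕分辨率", index))
```

Indexing by first character keeps the candidate set small at each position, and preferring the longest match is what makes "平板电脑" win over its shorter prefix "平板". Characters not covered by the dictionary fall through as single-character tokens.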
Keywords/Search Tags: Vertical Search, Theme Crawler, Chinese Word Segmentation, Full-text Retrieval, Lucene