Research And Implementation Of Vertical Search Engine

Posted on: 2009-11-02 | Degree: Master | Type: Thesis
Country: China | Candidate: L Xiao | Full Text: PDF
GTID: 2178360242966049 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and the World Wide Web, online resources are increasingly rich. To help people find useful information among this vast amount of material, a variety of Internet-based information retrieval services have emerged and developed rapidly. Currently, people search for information on the Internet primarily through Google, Baidu, and other general search engines. These engines are very powerful and, under normal circumstances, meet users' needs. However, when users want only specialized information, or information related to a specific industry or theme, such search engines fall short. Vertical search engines emerged specifically to solve this problem.

This thesis first discusses the significance of vertical search. It then describes search engine architecture in detail and studies in depth the core technologies of general search engines, including spider technology, Chinese word segmentation, and website ranking. By contrast with general search engines, it then introduces the key technologies required to build a vertical search engine.

On this basis, the thesis presents the two most important modules for building a vertical search engine, namely the web page collection module and the web information extraction module, together with their framework design and algorithm models. For the web page collection module, it discusses how to prevent "theme drift", the problem that a vertical search engine strives to solve, by means of theme judgment, theme forecast, and website ranking, and provides the corresponding algorithm model for each. For the information extraction module, it presents a prototype information extraction system based on XML technology.

Properly combining the search module with the information extraction module forms the core of a vertical search engine, which lays a good foundation for building a complete vertical search engine.
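
To make the idea of limiting "theme drift" concrete, the following is a minimal sketch of a topical crawler frontier, not the thesis's actual algorithm. It assumes a toy in-memory web (`pages`), cosine similarity over term counts as the theme judgment, and a weighted mix of parent relevance and anchor-text similarity as the theme forecast. The function names, the `threshold` and `parent_weight` parameters, and the scoring formula are all hypothetical, and the website ranking step is omitted.

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def crawl(pages, seeds, topic, threshold=0.2, parent_weight=0.5):
    """Visit a toy web in priority order and keep only on-topic pages.

    pages: dict mapping url -> (text, [(anchor_text, target_url), ...]).
    Returns a list of (url, relevance) in visit order.
    """
    topic_vec = Counter(tokenize(topic))
    # heapq is a min-heap, so priorities are negated to pop the best first.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    accepted = []
    while frontier:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue
        text, links = pages[url]
        # Theme judgment (assumed form): similarity of page text to the topic.
        relevance = cosine(Counter(tokenize(text)), topic_vec)
        if relevance < threshold:
            continue  # off-topic page: its links are not followed
        accepted.append((url, relevance))
        for anchor, target in links:
            if target in seen:
                continue
            seen.add(target)
            # Theme forecast (assumed form): guess the target's relevance
            # from the parent's score and the anchor text before fetching it.
            anchor_score = cosine(Counter(tokenize(anchor)), topic_vec)
            priority = parent_weight * relevance + (1 - parent_weight) * anchor_score
            heapq.heappush(frontier, (-priority, target))
    return accepted


toy_web = {
    "a": ("vertical search engine crawler",
          [("search engine news", "b"), ("cooking recipes", "c")]),
    "b": ("search engine ranking", [("more cooking", "d")]),
    "c": ("cooking pasta recipes", [("search engine", "e")]),
    "d": ("search engine", []),
    "e": ("search engine index", []),
}
visited = [url for url, _ in crawl(toy_web, ["a"], "search engine")]
```

In this toy run the off-topic page `c` is rejected and its outgoing links are never queued, which is one simple way to keep a crawl from drifting away from its theme. The cost is that on-topic pages reachable only through off-topic ones (here `e`) are missed.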
Keywords/Search Tags: Vertical Search Engine, Topical Crawler, Information Extraction