Design and Implementation of a Web-Based Search Engine System

Posted on: 2012-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: S Lian
GTID: 2208330335497475
Subject: Software engineering
Abstract/Summary:
To keep pace with the rapid growth of information on the Web, and to let users find useful information quickly and easily, search engines have gradually become part of everyday life. The "zhuzhu" search engine system was developed against this background.

This thesis first introduces the concept, history, and categories of search engines. It then presents the requirements analysis and system design of the "zhuzhu" search engine, followed by the detailed design and implementation of each functional module, and finally the testing of the system.

"zhuzhu" is a Web-based search engine dedicated to notebook computer brands. The front end follows the MVC pattern, the middle tier uses Spring, and the back end uses JDBC. The system is divided into three modules.

The crawling module fetches web pages in bulk into the system. It is implemented with the Heritrix web crawler.

The processing module parses the crawled pages and extracts their useful content. Because notebook brand names do not exist in the current lexicon, the module builds its own lexicon file for them. It then segments the parsed text into words, indexes it, and stores the index in the database. Page parsing is done with the HTMLParser API, and index construction with Lucene's API.

The user module is the system's user interface, through which users interact with the system. When a user enters a query for a brand in the search interface, the system returns the desired result set within an acceptable time. The module handles user requests with DWR, which encapsulates AJAX, and performs retrieval with Lucene's API.
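The processing and retrieval steps described above (a custom brand lexicon, word segmentation, indexing, and querying) can be illustrated with a small, self-contained Java sketch. It is not the thesis's implementation and does not use Lucene, Heritrix, or HTMLParser: it substitutes a plain in-memory inverted index and a forward-maximum-matching segmenter so that it runs on the standard library alone. The class name `BrandIndexSketch`, its methods, and the sample brand names and page texts are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the processing and user modules: a custom
// lexicon drives segmentation, and an in-memory inverted index replaces
// the Lucene index described in the abstract.
final class BrandIndexSketch {
    private final Set<String> lexicon = new HashSet<>();
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> pages = new ArrayList<>();
    private int maxTermLength = 1;

    BrandIndexSketch(Collection<String> brandNames) {
        for (String name : brandNames) {
            String term = name.toLowerCase(Locale.ROOT);
            lexicon.add(term);
            maxTermLength = Math.max(maxTermLength, term.length());
        }
    }

    // Forward maximum matching: at each position try the longest lexicon
    // entry first; otherwise fall back to a run of letters and digits.
    public List<String> segment(String text) {
        String s = text.toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            if (!Character.isLetterOrDigit(s.charAt(i))) {
                i++; // skip whitespace and punctuation
                continue;
            }
            String match = null;
            int end = Math.min(s.length(), i + maxTermLength);
            for (int j = end; j > i; j--) {
                String candidate = s.substring(i, j);
                if (lexicon.contains(candidate)) {
                    match = candidate;
                    break;
                }
            }
            if (match == null) {
                int j = i;
                while (j < s.length() && Character.isLetterOrDigit(s.charAt(j))) {
                    j++;
                }
                match = s.substring(i, j);
            }
            tokens.add(match);
            i += match.length();
        }
        return tokens;
    }

    // Stores the page text and records each of its tokens in the index.
    public int add(String pageText) {
        int id = pages.size();
        pages.add(pageText);
        for (String token : segment(pageText)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // AND query: returns ids of pages that contain every query token.
    public List<Integer> search(String query) {
        Set<Integer> result = null;
        for (String token : segment(query)) {
            Set<Integer> ids = postings.getOrDefault(token, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        BrandIndexSketch index = new BrandIndexSketch(List.of("ThinkPad", "MacBook Pro"));
        index.add("Lenovo ThinkPad review");
        index.add("Apple MacBook Pro 13 inch price");
        index.add("ThinkPad vs MacBook Pro");
        System.out.println(index.search("MacBook Pro"));
    }
}
```

The point of the lexicon is visible in the multi-word entry: with "MacBook Pro" registered as a brand name, the segmenter keeps it as a single term instead of splitting it into two unrelated words, which is the role the abstract assigns to the custom lexicon file.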
Keywords/Search Tags: search engine, Lucene, Heritrix, MVC