
The Design And Implement Of Search Engine System On The Campus Networks

Posted on: 2008-11-24
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 2178360212493824
Subject: Software engineering
Abstract/Summary:
With the rapid development of campus networks, the amount of information is growing quickly. Obtaining the latest and most useful information is one of the most important factors in seizing opportunities and achieving success. Although there are several excellent general-purpose search engines, such as Google and Baidu, they do not meet every requirement. For scientific, government, and university sites, fair ranking of results is very important. On the other hand, the quantity of web information exceeds the capacity of any search engine, even the most powerful one. At the same time, the existing campus search engine has limitations, such as low precision and recall and difficulty of maintenance and updating. This thesis therefore designs a flexible, configurable, extensible, and efficient search engine for campus networks.

Based on a study of the basic concepts, key technologies, and procedures, this thesis analyzes the specific requirements and then builds a campus network search engine system for Shandong University. It completes the overall design of the search engine framework and most of the development work.

This thesis describes the background of the system and the current state of search engine technology in China and abroad, and then elaborates on the development process and methods of this search engine. First, we analyze the functional and non-functional requirements of the campus search engine system. Second, we put forward our goals and principles, and describe the overall function and flow of the system from the perspectives of its functional structure and its technical structure. In the technical structure, we design a plug-in system to make the system more extensible, flexible, and maintainable, and to make its design and development easier.
We also use the Map/Reduce distributed processing model to improve the parallel processing ability of the system and reduce its dependence on hardware. Third, in the detailed design stage, we describe the design of several modules and of the plug-in system in detail. Crawlers fetch web information in breadth-first order. The index and search modules are both implemented on top of Lucene, a full-text search engine toolkit. The next part proposes detailed solutions for key issues in the implementation process. It describes the runtime environment and user interfaces, and emphasizes key problems such as the word segmentation algorithm and the link analysis algorithm. The last part analyzes the system's performance through testing. The experiments showed that the new campus search engine system is more efficient and more precise than the existing system.
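
The breadth-first crawl order mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's crawler: the function name `crawl_breadth_first`, the `get_links` callback, and the in-memory `LINKS` graph are all assumptions made for illustration, and no network access is performed.

```python
from collections import deque


def crawl_breadth_first(seed, get_links, max_pages=100):
    """Return pages in the order a breadth-first crawler would visit them.

    seed      -- starting page (hypothetical identifier)
    get_links -- callable returning the outgoing links of a page
    max_pages -- upper bound on the number of pages visited
    """
    visited = {seed}        # pages already queued, to avoid revisiting
    queue = deque([seed])   # FIFO queue gives breadth-first order
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)
        for link in get_links(page):
            if link not in visited:
                visited.add(link)
                queue.append(link)
    return order


# Hypothetical campus link graph, standing in for fetched pages.
LINKS = {
    "/": ["/news", "/library"],
    "/news": ["/news/1", "/"],
    "/library": ["/library/catalog"],
    "/news/1": [],
    "/library/catalog": ["/news"],
}

order = crawl_breadth_first("/", lambda page: LINKS.get(page, []))
print(order)
```

Pages one link away from the seed are visited before pages two links away, which is the property that distinguishes breadth-first from depth-first crawling.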
Keywords/Search Tags: Search Engine, Plug-in, Distribution, Web Crawler, Lucene