The Design And Implementation Of Search Engine Based On Lucene

Posted on: 2015-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2268330428490973
Subject: Computer technology
Abstract/Summary:
The Web-based unified publishing platform is a new style of management that combines convenience, speed, and efficiency. It makes full use of information technology to improve management efficiency, reduce users' workload, and improve the speed and accuracy of information transfer. The unified publishing platform has many functions; the search engine, one of its important parts, is the component implemented in this thesis. With the rapid growth of the Internet, many different kinds of information are now available. While enjoying the convenience the Internet brings, users also face the problem of finding information accurately, quickly, and easily in such a large-scale environment, and this has made search a focus of attention on the Internet.

In this thesis, the search engine consists of three main components: applications, data, and search. Starting from source file upload, and based on in-depth study of the structure, working process, and principles of information search and retrieval, the thesis analyzes these components step by step. With JSP technology, it designs and implements a flexible, simple user interface. Using the open-source Lucene engine architecture, it designs and implements a reusable, extensible index creation and management subsystem, and performs preliminary optimization of the site map. The system can create and manage indexes, search multiple sources, and upload source files, among other functions, and has a certain practical applicability.

Full-text search engines can be divided into two kinds. The first crawls the web itself: it has its own "spider" program (also called a "crawler" or "robot") and searches directly from its self-built web database. Such a system is called a self-built web search engine; Google and Baidu fall into this category. The second rents the database of another search engine and arranges the results in a custom format, as Lycos does. There are other types of search engines as well, such as directory indexes and meta search engines; representatives include Yahoo, Sina, InfoSpace, Dogpile, and Vivisimo. After 2006, a new type called the vertical search engine gradually emerged. It differs from other search engines in that it focuses on a specific search domain and search need (for example, flight search, travel search, life search, novel search, and video search) and offers a better user experience within that domain. Compared with general-purpose search and its retrieval at a scale of thousands, vertical search has the advantages of low hardware cost, specific user needs, and diverse query methods.

The system is designed on Lucene. Because the Web unified publishing platform, as a new style of management, improves management efficiency, reduces users' workload, and improves the speed and accuracy of information transfer, its research and implementation have become an area of considerable interest.

The unified information platform provides a full-text search service, with functions for creating and managing indexes of application information and for searching them. The system design makes full use of a framework that separates data from function, improving the reusability and interoperability of the system. The user interface subsystem is developed with JSP technology and handles search tasks effectively and quickly.
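
The index creation, index management, and search functions described in the abstract can be illustrated with a minimal sketch. The Python below builds a toy in-memory inverted index, the kind of data structure that Lucene-style full-text engines are organized around. It does not use Lucene, and it is not the thesis's implementation: the class name TinyIndex, its method names, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import defaultdict


class TinyIndex:
    """Toy in-memory inverted index (hypothetical; not Lucene's API)."""

    def __init__(self):
        # term -> set of ids of documents containing that term
        self.postings = defaultdict(set)
        # doc id -> original text, kept so documents can be removed later
        self.docs = {}

    def add(self, doc_id, text):
        """Index one document; re-adding an id replaces the old version."""
        if doc_id in self.docs:
            self.remove(doc_id)
        self.docs[doc_id] = text
        for term in self._tokenize(text):
            self.postings[term].add(doc_id)

    def remove(self, doc_id):
        """Delete a document and drop postings that become empty."""
        text = self.docs.pop(doc_id)
        for term in set(self._tokenize(text)):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, query):
        """Return sorted ids of documents containing every query term."""
        terms = self._tokenize(query)
        if not terms:
            return []
        # .get avoids creating empty entries in the defaultdict
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)

    @staticmethod
    def _tokenize(text):
        # Assumption: lowercase whitespace splitting. Chinese text would
        # need a word segmenter here instead.
        return text.lower().split()


index = TinyIndex()
index.add(1, "Lucene builds an inverted index")
index.add(2, "JSP renders the search page")
index.add(3, "the index supports full text search")
print(index.search("index"))         # [1, 3]
print(index.search("search index"))  # [3]
```

A query is answered by intersecting the posting sets of its terms, so search cost depends on how many documents contain each term and not on the total size of the collection. A production engine adds analysis (such as Chinese word segmentation), relevance scoring, and on-disk index segments, none of which this sketch attempts.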
Keywords/Search Tags: Search Engine, Chinese Word Segmentation, Index