
The Research And Design On Topic-specific Search Engine Based On Lucene

Posted on: 2008-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Jiang
Full Text: PDF
GTID: 2178360212990637
Subject: Computer application technology
Abstract/Summary:
With the rapid growth of information technology, the Internet has expanded enormously in recent years and has become an important information resource of enormous scale. However, information fortresses and information overload have become increasingly serious problems. Various Internet search engines have emerged and developed rapidly. Although Google, Yahoo and other general search engines are very powerful, they have some inadequacies when users search for information in a particular professional field. A topic-specific search engine is a search engine with precise classification and prompt updating. With the explosive growth and multiplication of information on the Internet, the topic-specific search engine is becoming a research hot spot and a development trend.

The thesis examines the key technologies of topic-specific search engines, presents a topic-specific search engine solution based on Lucene, and then designs and implements a job help-info search engine. Its content is organized as follows.

First, the thesis describes the background, development and characteristics of search engines, reviews their history, compares general search engines with topic-specific search engines, and highlights the professional advantages of the latter.

Second, the thesis analyzes in depth the three key technology modules of a search engine: information capture, indexing and retrieval. It studies the differences between a topic robot and a general robot, as well as their search strategies. It focuses on Lucene, a Java-based full-text indexing engine package, compares Lucene with a traditional database, and shows the efficiency and accuracy of using Lucene for indexing and retrieval.

Third, based on these key technologies, the thesis presents the design of the job help-info search engine, including the system design and technology strategy, the architecture, and the development environment. The design is then put into practice, and the specific implementation process of the Lucene-based job help-info search engine is described in detail.

Finally, the thesis summarizes the dissertation work and points out directions for future development and further work on continual updating and improvement. The job help-info search engine keeps its information complete and up to date, avoids heavy search noise, improves search efficiency, and provides quick, complete and accurate access to specialized information queries.

The main contributions of this thesis are as follows:
1. It analyzes the key technologies of general search engines and topic-specific search engines.
2. It analyzes in depth the Apache Lucene full-text search engine toolkit, compares Lucene retrieval with traditional database retrieval, and introduces Chinese word segmentation techniques.
3. Based on the analysis of the key technologies of topic-specific search engines, it designs a job help-info search engine based on Lucene.
4. It analyzes the key technologies used in the design and implementation; this analysis is the foundation for expansion and redevelopment.
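The comparison between index-based retrieval and traditional database retrieval can be illustrated with a minimal sketch. This is not Lucene's API and not the thesis's implementation. It is a toy inverted index in Python, assuming whitespace tokenization and AND semantics for multi-word queries. The function names and the sample job postings are hypothetical. For Chinese text, a word segmentation step would take the place of the `tokenize` function.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_index(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def search_scan(docs, query):
    """Scan-style alternative: examine every document for every query."""
    terms = tokenize(query)
    if not terms:
        return set()
    return {doc_id for doc_id, text in docs.items()
            if all(t in tokenize(text) for t in terms)}


# Hypothetical sample data for illustration only.
docs = {
    1: "Java developer position in Beijing",
    2: "Sales manager position in Shanghai",
    3: "Senior Java engineer wanted",
}
index = build_inverted_index(docs)
```

Both search functions return the same document ids. The difference is that `search_index` looks up only the posting sets of the query terms, while `search_scan` reads every document on every query, which is the cost an index is built once to avoid.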
Keywords/Search Tags:topic-specific Search Engine, Lucene, information indexing