
Research And Application Of Full-Text Retrieval System Based On Lucene

Posted on: 2011-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: L Yue
Full Text: PDF
GTID: 2178360302991170
Subject: Information Science
Abstract/Summary:
With the rapid growth of information on specialized websites, how to retrieve the information users care about efficiently has received much attention. Full-text search is a key technology for solving this problem. After a study of full-text search technology, this thesis designs a Lucene-based full-text retrieval system for distance education that gives users fast information lookup through topic search. First, the system architecture and function modules are analyzed according to the design principles of a topic search engine. Second, a web crawling module is implemented using Heritrix. Next, distance-education files such as PDFs are analyzed and organized, and several text extraction tools are integrated to build a text extraction module. The existing maximum matching method for Chinese word segmentation is improved, and a customized Chinese analyzer is implemented. In addition, Lucene's result ranking is modified to take page importance into account, which improves the relevance of the sorted results. Finally, an experimental environment is built to test the system. The experimental results show that the improved Chinese word segmentation and the ranking algorithm with page importance increase both segmentation precision and the relevance of the result list. Together, the segmentation and ranking algorithms meet the system requirements.
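For readers unfamiliar with the maximum matching method mentioned above, the following is a minimal sketch of the baseline forward variant, written in Python for brevity. It is not the thesis's improved algorithm, whose details are not given in the abstract; the function name `forward_max_match`, the `max_word_len` parameter, and the toy dictionary are all assumptions made for illustration.

```python
def forward_max_match(text, dictionary, max_word_len=4):
    """Segment text greedily, taking the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary matches.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"远程", "教育", "远程教育", "全文", "检索", "系统"}
print(forward_max_match("远程教育全文检索系统", toy_dictionary))
```

Because the longest candidate is tried first, "远程教育" (distance education) is kept as one token rather than split into "远程" and "教育". This greedy choice is also the method's known weakness on ambiguous strings, which is the kind of case an improved variant would need to address.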
Keywords/Search Tags: Full-Text Search, Lucene, Chinese Word Segmentation, Search Retrieval Ordering, SpringMVC