With the rapid growth of specialized website information resources, much attention has been given to the problem of efficiently retrieving the information that users care about. Full-text search is a key technology for solving this problem. After a study of full-text search technology, a Lucene-based full-text retrieval system for distance education is designed, which supports fast information queries through topic search. First, based on the design of a topic search engine, the architecture and functional modules of the system are analyzed. Second, the web crawling module is implemented using Heritrix. Then, the distance-education files, such as PDF documents, are analyzed and organized, and several text extraction tools are integrated to build the text extraction module. The existing maximum matching method for Chinese word segmentation is improved, and a customized Chinese analyzer is implemented. In addition, the result ranking of Lucene is modified to take page importance into account, which improves the relevance of the sorted results. Finally, an experimental environment is set up to test the system. The experimental results show that the improved Chinese word segmentation and the ranking algorithm that incorporates page importance increase both the precision of word segmentation and the relevance of the result list. With the segmentation and ranking algorithms combined, the system meets its requirements.
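For orientation, the maximum matching method mentioned above can be illustrated with a minimal sketch of plain forward maximum matching. This is only the textbook baseline, not the improved variant, whose details the abstract does not give. The function name `forward_max_match` and the toy dictionary are hypothetical and chosen purely for illustration.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text by greedy forward maximum matching against a dictionary.

    Baseline sketch only: at each position, take the longest dictionary
    word that starts there; fall back to a single character otherwise.
    """
    if max_len is None:
        # Longest dictionary entry bounds the candidate window.
        max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"远程", "教育", "远程教育", "全文", "检索", "系统"}
print(forward_max_match("远程教育全文检索系统", toy_dictionary))
# → ['远程教育', '全文', '检索', '系统']
```

Because the longest match always wins, "远程教育" is kept as one token even though "远程" and "教育" are also dictionary entries. This greedy choice is the usual source of segmentation errors in the baseline method.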