
Full-text Retrieval Of Distributed Geological Survey Data Based On Lucene

Posted on: 2014-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Ye
Full Text: PDF
GTID: 2248330398486252
Subject: Computer Science and Technology
Abstract/Summary:
With the continued strengthening of the China Geological Survey's informatization work, the constant accumulation of geological survey results has produced a large number of geological survey data documents. These documents are an important data basis for further geological studies, are of great use to science, and are a national treasure. To promote the use of geological data in public services, we need a way to provide the geological data we hold to the community. However, owing to constraints of staffing, resources, and priorities, these geological data have long been stored as files and documents. They are served to the public in the traditional way, by borrowing, or by keyword queries against the geological survey metadata directory through a portal website. There is no way to search the contents of the geological survey data themselves, so neither average users nor expert users can be given more accurate and more closely related geological survey data. How to improve the management and service level of geological survey data is the focus of this project.

Based on the practical applications of the geological information service platform, this paper first discusses the basic concepts and principles of full-text retrieval and describes some related techniques. Second, it studies the full-text retrieval engine toolkit Lucene, elaborating Lucene's advantages compared with other search engines as well as its architecture, index construction, and query mechanism. Finally, it designs a prototype retrieval system based on the file directory of geological data, comprising a geological survey data preprocessing module, a data indexing module, a retrieval module, and a data display module. This paper extends Lucene's Chinese word segmentation module in order to segment the geological data more accurately. For the different types of geological data, we design a generic interface that parses the data and converts it into a format that Lucene can handle. As a result, the geological data text retrieval system supports a variety of common geological data document formats.

Experiments and actual use demonstrate that the geological data text retrieval system implemented in this paper has greatly improved the performance of geological data retrieval in result ranking, recall, precision, and response time.
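The generic parsing interface mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not show the actual interface, so everything below is an assumption made for illustration: the names `IndexableDocument`, `DocumentParser`, `PlainTextParser`, `HtmlParser`, `PARSERS`, and `parse_document` are hypothetical, the two formats are chosen only because they need no third-party libraries, and the sketch is written in Python, whereas a system built on Lucene would more likely be written in Java.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from html.parser import HTMLParser
from pathlib import PurePath


@dataclass
class IndexableDocument:
    """Uniform record handed to the indexer (hypothetical field names)."""
    path: str
    title: str
    content: str


class DocumentParser(ABC):
    """Generic interface: each format-specific parser returns the same record type."""

    @abstractmethod
    def parse(self, path: str, raw: bytes) -> IndexableDocument:
        ...


class PlainTextParser(DocumentParser):
    def parse(self, path: str, raw: bytes) -> IndexableDocument:
        text = raw.decode("utf-8", errors="replace")
        lines = text.strip().splitlines()
        # Assumption: treat the first non-empty line as the title.
        title = lines[0] if lines else ""
        return IndexableDocument(path, title, text)


class _TextExtractor(HTMLParser):
    """Collects the <title> text and the visible text chunks of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif data.strip():
            self.chunks.append(data.strip())


class HtmlParser(DocumentParser):
    def parse(self, path: str, raw: bytes) -> IndexableDocument:
        extractor = _TextExtractor()
        extractor.feed(raw.decode("utf-8", errors="replace"))
        extractor.close()
        return IndexableDocument(path, extractor.title.strip(), " ".join(extractor.chunks))


# Registry mapping file extensions to parsers; supporting a new format
# means adding one entry here, with no change to the indexing code.
PARSERS = {".txt": PlainTextParser(), ".html": HtmlParser(), ".htm": HtmlParser()}


def parse_document(path: str, raw: bytes) -> IndexableDocument:
    """Pick a parser by file extension and return the uniform record."""
    suffix = PurePath(path).suffix.lower()
    parser = PARSERS.get(suffix)
    if parser is None:
        raise ValueError(f"no parser registered for {suffix!r}")
    return parser.parse(path, raw)
```

In this sketch the indexing code only ever sees `IndexableDocument`, so the set of supported formats can grow independently of how the index is built.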
Keywords/Search Tags: Geological survey data, Full-text retrieval, Lucene, Chinese word segmentation