XML Indexing and Retrieval Based on Lucene

Posted on: 2013-11-09  Degree: Master  Type: Thesis
Country: China  Candidate: J Xie  Full Text: PDF
GTID: 2248330374976340  Subject: Software engineering
Abstract/Summary:
Networked information and data exchange are defining features of the information age, and XML is being applied ever more widely. At the same time, the rapid growth of document data has made updating and maintaining XML documents an important problem. To make good use of XML documents, improve efficiency, and reduce development and maintenance costs, researchers have continued to study XML-related theory, from data storage to indexing to querying, and both XML theory and its applications have developed rapidly. This work has resolved many problems and produced many indexing and retrieval tools. However, as the demand for querying XML data grows and diversifies, there is increasing interest in a general index structure that accommodates different data sources (XML, plain text, relational databases, and other application data) and handles all kinds of queries effectively. Accordingly, researchers have proposed a variety of methods for building and maintaining indexes for different XML applications, so as to meet user needs in different environments.

This thesis first introduces full-text search systems and studies Lucene as a full-text search toolkit. It then analyzes existing XML technology in depth. Based on this analysis, XML parsing is used to extract data from XML document sources, the extracted data is segmented by an improved Lucene analysis module for Chinese word segmentation, and the result is indexed by an extended Lucene index module. Together these components form a prototype Lucene-based full-text search system for XML.

The thesis improves Lucene's index module and the efficiency of secondary development on top of it. The resulting implementation is simple and easy to port to other full-text retrieval systems, and it lays a solid foundation for building accurate, practical, and efficient full-text retrieval systems based on Lucene.
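The pipeline described in the abstract (parse XML, segment the extracted text, index it by field, then search) can be illustrated with a minimal sketch. This is not the thesis's implementation and does not use Lucene: it is a standard-library Python stand-in under stated assumptions. The function names `tokenize`, `build_index`, and `search`, and the sample documents, are hypothetical. Chinese text is segmented into overlapping character bigrams, a simple technique similar in spirit to Lucene's CJKAnalyzer, used here in place of the improved segmenter the thesis describes.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def tokenize(text):
    """Lowercased Latin/digit words, plus overlapping bigrams for CJK runs.

    Bigram segmentation is an assumption for illustration only; the thesis
    uses its own improved Chinese word segmentation module.
    """
    tokens = []
    for run in re.findall(r"[A-Za-z0-9]+|[\u4e00-\u9fff]+", text):
        if run[0].isascii():
            tokens.append(run.lower())
        elif len(run) == 1:
            tokens.append(run)
        else:
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


def build_index(docs):
    """Build an inverted index {(element_tag, term): set(doc_id)}.

    `docs` maps a document id to an XML string. Each element tag is treated
    as a field. Only the text directly inside an element (before its first
    child) is indexed; tail text is ignored to keep the sketch short.
    """
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            for term in tokenize(elem.text or ""):
                index[(elem.tag, term)].add(doc_id)
    return index


def search(index, field, query):
    """Return ids of documents whose `field` contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get((field, term), set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents.
DOCS = {
    "d1": "<paper><title>Lucene 全文检索</title><body>XML index</body></paper>",
    "d2": "<paper><title>XML 数据库</title><body>全文检索 system</body></paper>",
}

if __name__ == "__main__":
    idx = build_index(DOCS)
    print(search(idx, "title", "全文检索"))
    print(search(idx, "body", "全文检索"))
```

Keying the index on (element tag, term) pairs lets a query be restricted to one part of the XML structure, which is the main difference from indexing the document as flat text.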
Keywords/Search Tags: XML, Lucene, parser, Chinese word segmentation, full text search