
Distributed Data Search Based on Lucene

Posted on: 2018-03-29; Degree: Master; Type: Thesis
Country: China; Candidate: Y Yang
GTID: 2348330512988983; Subject: Engineering
Abstract/Summary:
With the growth of information technology and the Internet, the volume of online information has proliferated, and the sheer amount of data makes it difficult to find exactly the information we need. Search engines emerged in response: they make it easier to locate the required information precisely. The volume of data also creates storage problems. A single computer's storage capacity is limited, so large amounts of data must be spread across multiple computers, which is distributed storage. Once information is stored in a distributed system, it is difficult for a single centralized search engine to retrieve its contents, and retrieval efficiency is very low. A distributed search engine addresses this difficulty: each computer in the system runs its own local search engine, so that all nodes retrieve data in parallel and the results are then merged. This not only improves efficiency but also makes the system easier to manage.

This thesis implements a full-text search engine for distributed data based on the open-source Lucene toolkit. Within a local area network (LAN), it provides distributed indexing and retrieval of documents, parses structured documents and each supported document type, and monitors the documents on each computer so that the index is updated in real time. The features implemented are as follows:

1. Traverse all storage nodes in the distributed system, and parse and index the documents of each format on each computer.
2. After indexing, monitor the files on all storage nodes; whenever a document in the system changes, update the index automatically, so that the index stays correct and retrieval reflects changes in real time.
3. Implement field-based retrieval, so that individual fields can be searched separately.
4. Implement distributed storage and distributed parallel retrieval of indexes across the clients within the LAN.

The system has two types of nodes: clients store the data and index information, and the server is responsible for managing clients and handling the messages in the system. Any client can search for and download all relevant documents held by the clients on the LAN.
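The parallel retrieval and result merging described in the abstract can be illustrated with a minimal scatter-gather sketch. This is not the thesis's implementation: the class `ScatterGatherSearch`, the `Hit` type, and the modelling of each client node as a plain function from a query string to its local hits are all assumptions made for illustration, and the sketch uses only the Java standard library rather than Lucene itself. In a real deployment each function would stand in for a network call to a client that queries its own local index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ScatterGatherSearch {

    /** One search hit: the node holding the document, its path, and a relevance score. */
    public static final class Hit {
        public final String node;
        public final String path;
        public final double score;

        public Hit(String node, String path, double score) {
            this.node = node;
            this.path = path;
            this.score = score;
        }
    }

    /**
     * Sends the query to every node in parallel, then merges the per-node
     * results into one list sorted by descending score and cut to topK.
     * Each node is a hypothetical stand-in for a remote client's local index.
     */
    public static List<Hit> searchAll(List<Function<String, List<Hit>>> nodes,
                                      String query, int topK) throws Exception {
        if (nodes.isEmpty()) {
            return new ArrayList<>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(nodes.size());
        try {
            // Scatter: one task per node.
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (Function<String, List<Hit>> node : nodes) {
                tasks.add(() -> node.apply(query));
            }
            // Gather: invokeAll blocks until every node has answered.
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> future : pool.invokeAll(tasks)) {
                merged.addAll(future.get());
            }
            // Merge: rank all hits together and keep the best topK.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topK, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

One design caveat under these assumptions: scores computed by independent per-node indexes are based on each index's own term statistics, so ranking them together by raw score is only an approximation. A production system would likely need either shared statistics or a re-scoring step at the merging node.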
Keywords/Search Tags: Lucene full-text retrieval, document parsing, distributed, structured data