The Research Of Distributed Search Engine Based On Solr

Posted on: 2013-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: X S Zhang
Full Text: PDF
GTID: 2248330392457252
Subject: Software engineering
Abstract/Summary:
With the rapid growth of small and medium-sized enterprises and the increasing popularity of computer information technology, the volume of enterprise information has grown exponentially. Business users want to find exactly the information they need in a huge information store, but doing so unaided is as unrealistic as fishing for a needle in the ocean. Search engine technology is an effective way to solve this problem, because it provides users with a relatively simple information retrieval service. To handle huge amounts of data while maintaining search accuracy, the search engine system in this thesis combines distributed computing with Solr full-text retrieval technology.

The search engine uses a distributed processing architecture to handle massive data and highly concurrent requests. The thesis proposes a distributed search engine; the main research work is applying distributed computing to a traditional search engine. For massive data processing, it adopts distributed indexing and distributed search strategies and stores the index files in a distributed file system. It then discusses in depth the overall framework, structure, and processes for handling massive data effectively. For highly concurrent requests, it presents software and hardware load-balancing strategies together with optimization strategies for each distributed node, so that each active node can respond quickly under high concurrency. The nodes are deployed as master-slave replication clusters to better cope with huge data volumes and concurrent requests, and a handling mechanism for the Solr index is described.

Finally, a small distributed search engine system with two active nodes was built in the author's laboratory environment, with each active node deployed as a cluster of two computers. A large index was built on it, the engine was stress-tested, and the experimental data were collected. Analysis of the experimental results verifies the stability, scalability, and reliability of the system architecture.
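The combination of distributed search and load balancing described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the host names, the `ShardRouter` class, and the choice of a simple round-robin policy are all assumptions made for illustration, and it treats both machines in each cluster as able to answer queries. The only Solr feature it relies on is the `shards` request parameter, which tells the receiving Solr instance to fan the query out to the listed shards and merge their results.

```python
from itertools import cycle
from urllib.parse import urlencode

# Hypothetical topology mirroring the test setup in the abstract:
# two active nodes (shards), each backed by two machines.
SHARD_REPLICAS = [
    ["node1a:8983/solr", "node1b:8983/solr"],
    ["node2a:8983/solr", "node2b:8983/solr"],
]


class ShardRouter:
    """Round-robin over the machines of each shard (illustrative only)."""

    def __init__(self, shard_replicas):
        # One independent round-robin iterator per shard.
        self._cycles = [cycle(replicas) for replicas in shard_replicas]

    def pick_replicas(self):
        # Choose the next machine of every shard so load is spread evenly.
        return [next(c) for c in self._cycles]

    def build_query_url(self, query, rows=10):
        replicas = self.pick_replicas()
        params = urlencode({
            "q": query,
            "rows": rows,
            # Solr's distributed-search parameter: comma-separated shard list.
            "shards": ",".join(replicas),
        })
        # Send the request to the first chosen machine; it aggregates results.
        return f"http://{replicas[0]}/select?{params}"


if __name__ == "__main__":
    router = ShardRouter(SHARD_REPLICAS)
    for _ in range(2):
        print(router.build_query_url("title:solr"))
```

Successive calls alternate between the two machines of each shard, which is the software side of the load balancing the abstract mentions; a hardware load balancer in front of the nodes would make the same choice outside the application.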
Keywords/Search Tags:Distributed computing, Amounts of information, High concurrency, Solr search engine