Research And Application Of Full-Text Search Engine Based On Docker Technology

Posted on: 2018-02-12  Degree: Master  Type: Thesis
Country: China  Candidate: L L Zhao  Full Text: PDF
GTID: 2348330536979917  Subject: Software engineering
Abstract/Summary:
With the rise of the third revolution in the computing world and the widespread application of Cloud Computing and Big Data, data processing has reached the terabyte (TB) and even petabyte (PB) scale, and such data must be processed faster and more efficiently. As a result, the various Big Data processing methods and technologies derived from the Cloud Computing concept have become the mainstream of this wave. Hadoop is the most widely used Big Data processing platform in this wave, and a full-text search engine built on the Hadoop framework over virtualization technology offers stable operation, low cost, easy management, and combined storage and computing.

To construct the full-text search engine, this dissertation first analyzes and summarizes the merits and demerits of several current distributed search engines and proposes a distributed search engine based on the Hadoop platform. Second, it analyzes the limitations of traditional server deployment and compares the processing performance of traditional virtualization technology with that of Docker container technology; on that basis it uses Docker containers as the underlying architecture on which to build the Hadoop platform and optimize its performance. Third, the three subsystems of the distributed search engine (crawling, indexing, and query) are designed as parallel Map/Reduce algorithms and applications, with the computing tasks and their data encapsulated in Map and Reduce functions. In addition, the system uses inverted-index technology together with the TF-IDF (term frequency–inverse document frequency) and PageRank algorithms to calculate relevance in full-text retrieval. The underlying Docker containers also make the search engine easier to deploy and migrate.

In the experiments, this dissertation first verifies by comparison that Docker has a large performance advantage over traditional virtualization technology in reading and writing. It then designs and optimizes a deployment scheme for Hadoop on a Docker container cluster. Based on these two points, it constructs a full-text search engine system on a Docker-based Hadoop architecture and tests the performance, reliability, and scalability of the system. Analysis of the experimental data verifies the soundness and correctness of the full-text search engine built on the Docker-based Hadoop architecture.
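
The indexing and relevance steps named in the abstract (an inverted index built in Map/Reduce style, then a term-frequency, inverse-document-frequency score) can be illustrated with a small single-process sketch. This is not the dissertation's implementation: the function names `map_phase`, `reduce_phase`, `build_index`, and `search` are hypothetical, the tokenizer is a plain whitespace split, the score is the basic tf * log(N / df) form, and no Hadoop or Docker API is involved.

```python
import math
from collections import Counter, defaultdict


def map_phase(doc_id, text):
    """Map step: emit (term, (doc_id, count)) pairs for one document."""
    counts = Counter(text.lower().split())
    return [(term, (doc_id, n)) for term, n in counts.items()]


def reduce_phase(pairs):
    """Reduce step: group the emitted pairs by term into an inverted index."""
    index = defaultdict(dict)
    for term, (doc_id, n) in pairs:
        index[term][doc_id] = n
    return index


def build_index(docs):
    """Run the map step over every document, then reduce once.

    A real Map/Reduce job would distribute both steps across nodes;
    here they run sequentially for illustration only.
    """
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    return reduce_phase(pairs)


def search(index, num_docs, query):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

For example, calling `build_index` on a small dictionary that maps document ids to text and passing the result to `search` together with the number of documents returns (document id, score) pairs in ranked order. A PageRank component, which the abstract also mentions, is omitted from this sketch.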
Keywords/Search Tags: Hadoop, Map/Reduce, Docker, Distributed computing, Full-text searching