
Research Of The Distributed Search Model Based On Map/Reduce

Posted on: 2015-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: S C Hu
Full Text: PDF
GTID: 2268330428961178
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the volume of data is growing geometrically, which makes it difficult to find useful information in huge data sets quickly. Search technologies allow people to obtain information quickly and effectively. The most prominent of these tools is the search engine, but tools aimed at specific kinds of data, such as data retrieval tools in interdisciplinary fields, are highly useful as well.

This thesis carries out research on both of these aspects. From the perspective of research and design, it discusses and analyzes in detail the theories and technologies of distributed searching, and describes the Map/Reduce distributed architecture and Lucene. The work then proceeds along the two lines above.

The main content of the thesis is as follows. First, it develops stand-alone and distributed search models for academic papers, solving the problems encountered and optimizing the models. Second, it proposes improved methods of text classification and index storage for academic papers, which yield a significant increase in efficiency. Third, it develops stand-alone and distributed search models for gene/protein sequence searching, and gives a reasonable solution for optimizing the Combiner function and solving the problem of data skew. Finally, it demonstrates the superiority of the distributed model for big data problems by comparing its experimental results with those of the stand-alone model.

By designing and developing search tools and extending them to a distributed setting, the thesis shows that the distributed search model has clear advantages in the field of big data, and it gives proper handling and detailed answers for the problems encountered along the way. The work is therefore of great significance.
Keywords/Search Tags: distributed searching, Map/Reduce, optimization