
Research on a Parallel Computing Framework Based on YARN

Posted on: 2016-01-20  Degree: Master  Type: Thesis
Country: China  Candidate: M M Zhu  Full Text: PDF
GTID: 2308330470976679  Subject: Computer technology
Abstract/Summary:
The core of the Apache Hadoop framework consists of the MapReduce programming model and the HDFS distributed file system. MapReduce provides parallel computation over massive data, while HDFS provides the storage for that data. MapReduce is a parallel programming model used mainly for parallel computation over very large data sets. In the first few years after its release, the model produced many successful applications and gained wide support and recognition in industry. However, as distributed clusters grew in scale and other workloads surged, the problems of the original framework gradually surfaced. Its existing mechanisms need large-scale adjustment to address flaws in memory consumption, scalability, the thread model, reliability, and performance. Over the past few years the Hadoop team has fixed a number of bugs, but the rising cost of these repairs shows that the original framework is becoming harder and harder to change. To carry Hadoop further and to solve at the root the key problems affecting MapReduce performance, the Apache open source community thoroughly rebuilt the old MapReduce framework starting with version 0.23.0, fundamentally changing its architecture. The rebuilt MapReduce framework is known as Hadoop 2, or YARN.
This thesis first describes in detail the programming ideas, working principle, and concrete steps and methods of MapReduce. It then expounds in detail the YARN programming model and framework, together with its working principle and concrete steps and methods. YARN is compared with MapReduce: the deficiencies of MapReduce are studied and the differences between YARN and MapReduce are outlined. Finally, a Hadoop cluster environment is built and MapReduce parallel computation experiments are run on the YARN framework; the experiments demonstrate the efficiency and reliability of parallel computation under the YARN framework.
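As an illustration of the MapReduce programming model the abstract refers to, the following is a minimal single-process sketch in Python. It is not taken from the thesis and does not use Hadoop or YARN APIs: the word-count task and the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are assumptions chosen for the example. On a real cluster the framework distributes the map and reduce calls across nodes and performs the grouping step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key (done by the framework between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially in one process (hypothetical driver)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(word_count(["yarn runs mapreduce", "mapreduce on yarn"]))
```

Because each `map_phase` call depends only on its own line and each `reduce_phase` call only on its own key, both phases can in principle run in parallel, which is the property the model relies on.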
Keywords/Search Tags: Hadoop, MapReduce, YARN, ID3, Parallel computing