Replica Placement Strategy Research In MapReduce Cluster

Posted on: 2015-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Qian
GTID: 2268330425984730
Subject: Computer application technology
Abstract/Summary:
In heterogeneous environments, the nodes of a cluster differ in performance because of their differing hardware configurations. Hadoop, the most widely used MapReduce implementation, does not sufficiently take such environments into account. As a result, the load placed on each node does not match its performance, and many of the map tasks launched are not data-local, which severely degrades performance.

In this thesis, we propose a novel replica placement strategy that improves the performance of MapReduce in heterogeneous environments. The main contributions of this thesis are as follows.
1) We analyze the issues that cause the performance degradation of MapReduce in heterogeneous environments.
2) We propose a replica creation strategy based on the performance capabilities of the nodes.
3) We propose a dynamic adjustment strategy, also based on the performance capabilities of the nodes, which adjusts the number of replicas of the files in the cluster and balances the load across nodes according to the access history and a given extra storage budget.
4) Experiments show that, with the proposed replica placement strategy, the proportion of data-local map tasks increases, the response time of MapReduce jobs decreases, and the load of each node better matches its performance.
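The abstract does not give the details of the replica creation strategy, so the following is only a minimal sketch of the general idea behind contribution 2: placing block replicas so that each node's share of the data roughly follows its performance capability. The function names, the per-node `capacity` scores, and the greedy rule used here are all assumptions made for illustration and are not taken from the thesis or from Hadoop.

```python
from collections import Counter
from typing import Dict, List


def place_replicas(num_blocks: int,
                   capacity: Dict[str, float],
                   replication: int = 3) -> Dict[int, List[str]]:
    """Greedy capacity-weighted placement (illustrative assumption only).

    capacity maps a node name to a positive, hypothetical performance score.
    Each block gets `replication` replicas on distinct nodes, chosen so that
    the replica count per node stays roughly proportional to its score.
    """
    if replication > len(capacity):
        raise ValueError("replication factor exceeds number of nodes")
    if any(score <= 0 for score in capacity.values()):
        raise ValueError("capacity scores must be positive")

    load = {node: 0 for node in capacity}
    placement: Dict[int, List[str]] = {}
    for block in range(num_blocks):
        # Rank nodes by the load they would carry after receiving this
        # replica, normalized by capacity; ties are broken by node name.
        ranked = sorted(capacity,
                        key=lambda n: ((load[n] + 1) / capacity[n], n))
        chosen = ranked[:replication]
        for node in chosen:
            load[node] += 1
        placement[block] = chosen
    return placement


def node_loads(placement: Dict[int, List[str]]) -> Dict[str, int]:
    """Count how many replicas each node holds."""
    return dict(Counter(n for nodes in placement.values() for n in nodes))
```

With one node scored at twice the capability of the other two, for example `place_replicas(4, {"fast": 2.0, "mid": 1.0, "slow": 1.0}, replication=1)`, the faster node ends up holding twice as many blocks as each slower node. The dynamic adjustment of replica counts from access history and a storage budget (contribution 3) is not modeled in this sketch.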
Keywords/Search Tags: MapReduce, Heterogeneous-environments, Block-replication-placement