Research Of Dynamic Replication Strategy For Cloud Computing Based On HDFS

Posted on: 2016-02-10  Degree: Master  Type: Thesis
Country: China  Candidate: B Chen  Full Text: PDF
GTID: 2308330467473361  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, intensive computing and large-scale storage are increasingly needed. Because of its strong computing power, low cost, ease of access, and strong scalability, cloud computing has been widely adopted and has become a research hot spot. Replica management is one of the key techniques of a cloud computing system, since it guarantees the high reliability of cloud storage. However, many Hadoop-based cloud computing systems still have deficiencies in several respects.

First, the default HDFS replication strategy is static. If a large number of requests access the same data in a short time, those objects become "hot files". Hot files reduce access speed and degrade the system's read performance. Second, replica resource scheduling lacks a standard. There is currently no clear standard for calculating the number of replicas; the system usually creates a new replica only when one is needed. In replica deletion, the replicas that have existed the longest are chosen for deletion, a strategy that lacks a theoretical basis. Finally, some existing replica placement algorithms do not consider the heterogeneity of nodes: every node is treated equally, which is not accurate. There are other problems as well; for example, the time complexity of some algorithms is too high, or the model is too simple.

To address these problems, this thesis does the following work on the basis of previous research:

(1) Under the default static HDFS replication strategy, a large number of access requests reduces the quality of service. A central controller, two balance timers, a history access record stack, and an access cache are added to the native HDFS file system, so that the improved system can adjust the number of replicas dynamically.

(2) To address the lack of a standard for replica scheduling, this thesis calculates the number of replicas from the replica access frequency: the access frequency and the average frequency together determine the number of replicas. For replica deletion, the worst-performing replica is deleted, judged by the impact of the server node, frame, and function module.

(3) For replica placement, this thesis proposes SRMD, a replica placement algorithm based on the typical three-layer data center network structure. The SRMD algorithm takes the computation server, computer frame, function module, and network node distance into consideration, and then chooses the best-performing nodes on which to place replicas. The cluster simulation software CloudSim 3.0 is used to simulate a complex cloud network. The experimental results, compared against two other strategies, show that the dynamic replication strategy proposed in this thesis is correct and feasible.
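The abstract does not give the formula behind item (2), so the following is only a minimal sketch of how a replica count might be derived from a file's access frequency relative to the average frequency. The function names, the linear scaling rule, and the `max_replicas` cap are all assumptions made for illustration and are not taken from the thesis; the base value of 3 matches the usual HDFS default replication factor.

```python
import math


def target_replica_count(file_accesses, avg_accesses,
                         base_replicas=3, max_replicas=10):
    """Return a replica count scaled by access frequency (hypothetical rule).

    The linear scaling below is an assumption, not the thesis's formula.
    """
    if avg_accesses <= 0:
        # No usable average: fall back to the static default.
        return base_replicas
    ratio = file_accesses / avg_accesses
    wanted = math.ceil(base_replicas * ratio)
    # Never drop below the base count, never exceed the assumed cap.
    return max(base_replicas, min(max_replicas, wanted))


def plan_replicas(access_counts, base_replicas=3, max_replicas=10):
    """Map each file name to a target replica count for one time window."""
    if not access_counts:
        return {}
    avg = sum(access_counts.values()) / len(access_counts)
    return {
        name: target_replica_count(count, avg, base_replicas, max_replicas)
        for name, count in access_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical access counts collected during one balance-timer period.
    window = {"a.log": 10, "b.log": 10, "hot.dat": 100}
    print(plan_replicas(window))
```

In this sketch a "hot file" whose access count is well above the average receives extra replicas, while files at or below the average keep the static base count; a real system would also need the deletion and placement policies described in items (2) and (3).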
Keywords/Search Tags: Cloud Computing, Improved HDFS System, Dynamic Replication Placement Strategy, Load Balancing Strategy