
Research On Dynamic Replica Creation And Consistency Strategy In Hybrid Cloud Environment

Posted on: 2018-07-11    Degree: Master    Type: Thesis
Country: China    Candidate: Y H Zhao    Full Text: PDF
GTID: 2428330596954791    Subject: Computer Science and Technology
Abstract/Summary:
Data stored in the hybrid cloud today is typically huge in volume, varied in type, and low in value density. Distributed storage is the foundational technology of a hybrid cloud platform. Because node capacity is limited and file data is large, files stored in the hybrid cloud are usually split into blocks that are stored separately. To improve fault tolerance, these data blocks are widely replicated, so that even if a data node in the hybrid cloud business server cluster fails, users can still obtain copies from the remaining healthy data nodes. However, storage cost grows with the number of replicas, and because users' interests differ, many users access only certain files, or only some blocks of a file. Current distributed file systems such as GFS (Google File System) and HDFS (Hadoop Distributed File System) therefore waste considerable storage by relying on a static replica creation strategy. How to create replicas dynamically in the hybrid cloud, with high reliability and low cost, according to users' file-access patterns and the changing heat of file data blocks, has thus become an urgent problem.

In addition, replicas are distributed over many data-center nodes, and multiple users share data resources in the hybrid cloud. A replica may be created or deleted dynamically at any time, while users may simultaneously read or modify the data, which can leave the replicas inconsistent. Maintaining data consistency efficiently in the hybrid cloud is therefore a second key problem.

To address these problems, this thesis makes contributions in the following three aspects:

(1) To guarantee reliable storage, the thesis first initializes the number of replicas per file; since users require different levels of reliability for different files, the initial replica count differs from file to file. Moreover, because user interest varies over time, the read and write frequency differs between files and even between blocks of the same file. Taking the node load, the limited storage capacity of the private cloud, and the communication cost between the public and private clouds as constraints, the thesis proposes a dynamic replica creation strategy based on data block heat and node load that seeks maximum reliability at minimum cost. Finally, an artificial immune algorithm is used to approximate the optimal replica count for each data block, jointly optimizing reliability and cost.

(2) To address the inconsistency caused by dynamically adding or removing nodes and by concurrent modification of the same data, the thesis proposes a data consistency strategy based on the Fast Paxos algorithm. The strategy refines the leader election phase of Fast Paxos by defining a node weight from the node's transmission speed, CPU performance, available memory, and read performance. Because the judgment matrix of the analytic hierarchy process is subject to human bias when these weights are computed, the matrix is corrected with a fuzzy comprehensive evaluation mechanism, and its consistency is then checked with an accelerated genetic algorithm. Using the resulting node weights, the thesis designs a priority-based vote message queue and a weight-based leader election algorithm, improving the leader node's processing performance and enabling efficient consistency maintenance.

(3) The proposed methods are validated experimentally. The thesis compares the heat- and load-based dynamic replica creation algorithm against the HDFS default policy and the Aurora replica creation algorithm; the results show that the proposed algorithm effectively reduces the disk space occupied by data block replicas and achieves satisfactory average response time and average payment cost. It also compares the leader-election-based Fast Paxos algorithm with ZooKeeper's ZAB algorithm and with S-Paxos, analyzing performance as the number of requests, the cluster size, and node failures vary; the results show that the proposed algorithm outperforms the others in consistency maintenance time and throughput.
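The two core ideas summarized above, heat-aware replica counting and weight-based leader election, can be sketched as follows. This is a minimal illustration only, not the thesis's actual algorithm: all names, coefficients, and thresholds are hypothetical, and the artificial immune optimization and fuzzy-AHP weighting described in the abstract are replaced here by plain weighted sums.

```python
# Minimal sketch (hypothetical names and thresholds; the thesis's
# artificial-immune and fuzzy-AHP steps are simplified to weighted sums).

def replica_count(block_heat, node_load, min_replicas=2, max_replicas=6):
    """Pick a replica count for a data block: hotter blocks get more
    replicas, while heavy node load damps the growth."""
    # block_heat and node_load are assumed pre-normalized to [0, 1].
    demand = block_heat * (1.0 - node_load)
    extra = round(demand * (max_replicas - min_replicas))
    return min(max_replicas, min_replicas + extra)

def node_weight(tx_speed, cpu, free_mem, read_perf,
                coeffs=(0.3, 0.3, 0.2, 0.2)):
    """Combine the four per-node metrics named in the abstract into one
    weight (metrics assumed pre-normalized to [0, 1])."""
    metrics = (tx_speed, cpu, free_mem, read_perf)
    return sum(c * m for c, m in zip(coeffs, metrics))

def elect_leader(nodes):
    """Weight-based leader election: the highest-weight node wins."""
    return max(nodes, key=lambda name: node_weight(*nodes[name]))

# Example: a hot block on a lightly loaded node gets extra replicas,
# and the node with the best all-round metrics becomes leader.
cluster = {
    "a": (0.9, 0.4, 0.5, 0.6),
    "b": (0.8, 0.9, 0.7, 0.8),
    "c": (0.3, 0.5, 0.9, 0.4),
}
print(replica_count(block_heat=0.9, node_load=0.2))
print(elect_leader(cluster))
```

In the thesis, the replica count is instead searched by an artificial immune algorithm under private-cloud capacity and inter-cloud communication constraints, and the weight coefficients come from a fuzzy-corrected AHP judgment matrix rather than fixed constants.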
Keywords/Search Tags: Hybrid Cloud, Dynamic Replica Creation, Data Block Heat, Fault Tolerance, Consistency