
Research On The Reliability Technology Of Fault-tolerant And Fault-detection-based Cloud Storage

Posted on: 2016-12-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Y Yang
Full Text: PDF
GTID: 1318330476455897
Subject: Computer application technology
Abstract/Summary:
Storing and processing data has become a major challenge in the era of big data, and data centers keep growing to meet the demand for massive storage and computation. Google's cloud data centers contain millions of servers, and because so many commodity servers provide service around the clock, node failure has become the norm. How to improve the reliability of the cloud system and maintain quality of service when some nodes fail is therefore worth further study.

To address the key problem of building a cloud storage system with low redundancy and high reliability, the dissertation studies fault-tolerance technology (replicas and erasure codes), failure-detection technology, and a high-reliability storage architecture, drawing on data reliability theory and the system architecture of massive data storage. The main results are as follows.

(1) To address the low efficiency that a static replica strategy causes in cloud storage systems, the dissertation proposes a dynamic replica strategy with two parts: a replica placement strategy and a replica number strategy. In the replica placement strategy, nodes are divided into groups according to their relevance, and the groups are organized into a virtual ring. A hash algorithm determines the groups in which replicas are placed, and TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) is used to solve the resulting MADM (Multiple Attribute Decision Making) problem over node performance and load, so that a node with high performance and low load is selected to hold the replica. The number of replicas is adjusted according to the heat of the file and the load of the nodes: replicas are created for high-heat files on heavily loaded nodes, and some replicas are deleted for low-heat files on lightly loaded nodes.
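The TOPSIS node-selection step described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `topsis_rank`, the two attributes (a performance score and a load value), the equal weights, and the sample nodes are all assumptions chosen for illustration, and the hashing of replicas onto the virtual ring of groups is left out.

```python
import math


def topsis_rank(nodes, weights, benefit):
    """Rank candidate nodes with TOPSIS (illustrative sketch).

    nodes:   dict mapping node name -> list of attribute values
    weights: one weight per attribute (assumed to sum to 1)
    benefit: one bool per attribute, True if larger is better
    Returns (names sorted best first, dict of closeness scores).
    """
    names = list(nodes)
    columns = list(zip(*(nodes[n] for n in names)))

    # Vector-normalize each attribute column, then apply the weights.
    norms = [math.sqrt(sum(v * v for v in col)) or 1.0 for col in columns]
    weighted = {
        n: [w * v / norm for v, w, norm in zip(nodes[n], weights, norms)]
        for n in names
    }

    # Ideal and anti-ideal points, attribute by attribute.
    wcols = list(zip(*(weighted[n] for n in names)))
    ideal = [max(c) if b else min(c) for c, b in zip(wcols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(wcols, benefit)]

    # Closeness score: distance to anti-ideal over the sum of both distances.
    scores = {}
    for n in names:
        d_pos = math.dist(weighted[n], ideal)
        d_neg = math.dist(weighted[n], anti)
        total = d_pos + d_neg
        scores[n] = d_neg / total if total else 0.0

    ranking = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranking, scores


# Hypothetical candidates: [performance score, current load].
candidates = {
    "node-a": [0.9, 0.2],
    "node-b": [0.6, 0.5],
    "node-c": [0.3, 0.9],
}
ranking, scores = topsis_rank(
    candidates,
    weights=[0.5, 0.5],      # assumed equal importance
    benefit=[True, False],   # performance: higher is better; load: lower is better
)
print(ranking[0])  # node chosen to hold the replica
```

In this sample, node-a has both the highest performance and the lowest load, so it coincides with the ideal point and ranks first; with less clear-cut candidates the weights decide the trade-off.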
The experiments show that the dynamic replica strategy improves system performance and that the replica number can be adjusted adaptively.

(2) To address the low fault tolerance of array codes, the dissertation proposes an extended X code (XEX), which suits cloud storage systems because of its high fault tolerance and high storage efficiency. The XEX code organizes the codeword into a two-dimensional $n \times n$ array, which is encoded with XOR operations along lines of slopes $(1, -1, 1/2, -2, 1/3, -3, \ldots)$, and the coding redundancy is placed in different columns. The coding process is described algebraically, the MDS property is proved, and the encoding algorithm of the XEX(m, 3) code is designed using an "eliminating duplication" technique. The XEX code tolerates at least three simultaneous disk failures, and its fault-tolerance level can be configured as required.

(3) To address the poor adaptability of traditional failure detection, the dissertation presents an adaptive failure-detection strategy for cloud storage based on delay prediction. The strategy predicts the delay by applying exponential smoothing to historical heartbeat information.
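A minimal sketch of a heartbeat failure detector built on exponential smoothing is given below. It is an illustration under stated assumptions rather than the dissertation's algorithm: the class name `AdaptiveFailureDetector`, the smoothing factor `alpha`, the multiplier `k`, the use of heartbeat inter-arrival time as the predicted delay, and the use of a smoothed mean squared prediction error as the dynamic correction value are all choices made here.

```python
import math


class AdaptiveFailureDetector:
    """Heartbeat failure detector sketch: smoothed delay plus dynamic margin."""

    def __init__(self, alpha=0.3, k=2.0):
        self.alpha = alpha        # smoothing factor (assumed value)
        self.k = k                # margin multiplier (assumed value)
        self.predicted = None     # exponentially smoothed delay estimate
        self.mse = 0.0            # smoothed mean squared prediction error
        self.last_arrival = None  # arrival time of the latest heartbeat

    def heartbeat(self, arrival_time):
        """Record a heartbeat and update the delay prediction."""
        if self.last_arrival is not None:
            delay = arrival_time - self.last_arrival
            if self.predicted is None:
                self.predicted = delay
            else:
                error = delay - self.predicted
                # Dynamic correction: track the squared prediction error.
                self.mse = self.alpha * error ** 2 + (1 - self.alpha) * self.mse
                # Exponential smoothing of the observed delay.
                self.predicted = (
                    self.alpha * delay + (1 - self.alpha) * self.predicted
                )
        self.last_arrival = arrival_time

    def timeout(self):
        """Predicted delay plus a margin that grows with recent error."""
        if self.predicted is None:
            return None
        return self.predicted + self.k * math.sqrt(self.mse)

    def suspect(self, now):
        """Return True if the node has been silent longer than the timeout."""
        limit = self.timeout()
        if limit is None:
            return False
        return now - self.last_arrival > limit


detector = AdaptiveFailureDetector()
for t in [0.0, 1.0, 2.0, 3.0, 4.0]:   # hypothetical arrival times in seconds
    detector.heartbeat(t)
print(detector.suspect(4.5), detector.suspect(6.0))
```

Because the margin is derived from recent prediction error, the timeout widens when heartbeat delays become erratic and tightens again when they stabilize, which a fixed correction value cannot do.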
The exponential smoothing prediction is closer to the real delay than a prediction based on a simple average; using a dynamically computed mean square value as the correction value is more flexible than the fixed correction value of traditional failure detectors; and detection accuracy is improved by a dual model that combines heartbeats with interaction messages in the master-slave cloud storage system.

(4) To address the high space overhead of replica-based fault tolerance and the low access efficiency of erasure-code-based fault tolerance, the dissertation proposes a mixed fault-tolerance mechanism based on replicas and erasure codes, which adaptively chooses the scheme according to file heat: replication is used for hot files to improve access efficiency; erasure coding is used for rarely accessed files to improve storage efficiency; and the mixed scheme is used for popular files to balance access efficiency and storage efficiency.

The thesis is supported by the National Natural Science Foundation of China (NSFC) under grants No. 61272116 and No. 61472294, the Natural Science Foundation Key Project of Hubei Province (No. 2014DFA050), and the Application Foundation Research Plan of Wuhan City (No. 2015010101010021).
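The heat-based choice in point (4) and the storage trade-off behind it can be sketched as follows. This is a hedged illustration only: the normalized heat value in [0, 1], the two thresholds, the three-replica setting, and the (k = 6, m = 3) erasure-code parameters are assumptions, not values taken from the dissertation.

```python
def replication_overhead(replicas):
    """Raw bytes stored per byte of user data with full replication."""
    return float(replicas)


def erasure_overhead(k, m):
    """Raw bytes stored per byte of user data with k data + m parity blocks."""
    return (k + m) / k


def choose_scheme(heat, hot_threshold=0.7, cold_threshold=0.3):
    """Pick a fault-tolerance scheme from a normalized heat value.

    The thresholds are hypothetical; heat is assumed to lie in [0, 1].
    """
    if heat >= hot_threshold:
        return "replica"   # hot file: favor access efficiency
    if heat <= cold_threshold:
        return "erasure"   # rarely accessed file: favor storage efficiency
    return "mixed"         # in between: balance the two


print(replication_overhead(3), erasure_overhead(6, 3))
print(choose_scheme(0.9), choose_scheme(0.5), choose_scheme(0.1))
```

With these assumed parameters, three-way replication stores 3 bytes per user byte while the erasure code stores 1.5, which is the space saving that motivates moving colder files to erasure coding.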
Keywords/Search Tags: cloud storage, data fault tolerance, replica, erasure code, failure detector