The Research And Implementation Of Distributed Storage System Fault-tolerance Mechanism

Posted on: 2019-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
GTID: 2428330590467474
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of data generated by human society, distributed storage systems keep growing in scale: node counts range from several thousand to hundreds of thousands, which greatly increases the probability that a single disk or node fails. Fault tolerance is therefore an indispensable research topic in distributed storage, and ensuring the reliability of such systems requires studying high-reliability techniques. Building on the Blue Ocean Storage System, a distributed storage system independently developed by our laboratory, this thesis studies key fault-tolerance technologies for distributed storage systems. The main contributions are as follows:
(1) For data layout, we propose a hierarchical layout algorithm that selects the data placement location using two hashes. It provides high data reliability and load balancing, and supports dynamic changes in cluster scale at low cost.
(2) Reed-Solomon (RS) erasure coding is implemented on the Blue Ocean Storage System; it provides higher fault tolerance than multi-copy replication and improves storage space utilization. To address the excessive network bandwidth consumed during data recovery, we propose a recovery strategy based on Prim's minimum spanning tree algorithm, which effectively reduces the network bandwidth needed for erasure-code data recovery.
(3) We propose a disk health detection method that divides the disk space equally into multiple sample areas and then randomly selects sample points within each area. Taking IOPS performance and latency into account, it can detect failed disks in a short time while maintaining accuracy. This plays an important role in promptly identifying and replacing failed disks and in ensuring data reliability.
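The sampling step of the disk health detection method (equal sample areas, random points within each area) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `sample_offsets`, the 4096-byte block size, and the parameters `num_areas` and `points_per_area` are assumptions introduced here for illustration, and the abstract does not say how the sampled points are probed or scored.

```python
import random


def sample_offsets(disk_size, num_areas, points_per_area, block_size=4096, seed=None):
    """Pick block-aligned byte offsets for a health probe.

    The disk is split into `num_areas` equal areas and `points_per_area`
    random blocks are chosen inside each area, so every region of the
    disk is covered. All names and defaults are hypothetical.
    """
    if num_areas <= 0 or points_per_area <= 0:
        raise ValueError("num_areas and points_per_area must be positive")
    total_blocks = disk_size // block_size
    if total_blocks < num_areas:
        raise ValueError("disk is too small for the requested number of areas")

    rng = random.Random(seed)
    blocks_per_area = total_blocks // num_areas  # equal-sized areas
    offsets = []
    for area in range(num_areas):
        first_block = area * blocks_per_area
        for _ in range(points_per_area):
            # Uniform random block within this area only.
            block = first_block + rng.randrange(blocks_per_area)
            offsets.append(block * block_size)
    return offsets


if __name__ == "__main__":
    # 1 GiB disk, 8 areas, 4 probe points per area.
    for offset in sample_offsets(1 << 30, 8, 4, seed=1):
        print(offset)
```

A detector built on this would read one block at each returned offset and judge the disk by read errors or latency. Sampling every area keeps the number of reads small while still touching the whole address range, which matches the trade-off between detection time and accuracy described above.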
Keywords/Search Tags:Distributed storage system, Data fault tolerance, Data layout, Erasure code, Disk detection