
Research On Multi-fault-tolerant MDS Array Codes In Distributed Storage Systems

Posted on: 2021-05-22  Degree: Master  Type: Thesis
Country: China  Candidate: W Yuan  Full Text: PDF
GTID: 2518306725952369  Subject: Software engineering
Abstract/Summary:
In the era of big data, the amount of data that must be stored worldwide grows every day, and people pay increasing attention to the value of data. However, as cluster scale expands, node failures that lead to data loss in a distributed storage system are no longer rare events. Multi-copy (replication) technology, the traditional fault-tolerance technique, is simple and easy to implement, but it produces a large amount of redundant data, which leads to low space utilization on nodes and a serious waste of resources. Erasure-code fault tolerance can tolerate multiple node failures while generating only a small amount of redundancy, which ensures both high reliability and high space utilization in the distributed storage system.

Array codes, a type of erasure code that involves only XOR operations and therefore has very high encoding and decoding efficiency, have been widely used. However, traditional array codes have limitations that affect their practicality. First, they cannot balance storage efficiency and fault tolerance. A maximum distance separable (MDS) array code guarantees a given fault tolerance with the least redundancy, but the fault tolerance of existing MDS array codes is low. Some array codes improve fault tolerance, but all of them do so at the expense of the MDS property. Therefore, as a distributed storage system scales up, these codes cannot achieve optimal space utilization while also raising fault tolerance to ensure high reliability. Second, most array codes place strict restrictions on the stripe size, which hinders system scalability.

In view of this situation, after analyzing and comparing a variety of array codes, this thesis proposes the CMA code, a multi-fault-tolerant MDS array code (fault tolerance greater than 3). The specific work is as follows.

This thesis analyzes and compares erasure codes with multi-fault-tolerant or MDS characteristics, including the RS code, EVENODD code, STAR code, RDP code, X code, Grid code, and Slope code. By analyzing the encoding and decoding processes of these codes, it identifies their advantages and limitations and lays the groundwork for MDS array codes with higher fault tolerance.

The CMA code is proposed. It is an MDS array code with multiple fault tolerance, and its fault tolerance capability is greater than 3. Instead of the traditional approach of building check-element generation chains from horizontal lines and diagonal lines with slope 1 or -1, a matrix is used to organize and build the check-element generation chains. The fault tolerance of the array code is not limited, the code achieves the best space utilization for a given fault tolerance, and it places no strict limit on the stripe size.

An encoding matrix construction optimization algorithm is proposed. When a matrix is used to construct the check-element generation chains, the amount of computation required by array codes built from different matrices differs considerably. Therefore, to improve the encoding and decoding efficiency of the array code, the algorithm constructs an encoding matrix that requires less computation.

Erasure-code fault tolerance is embedded in Ceph to build a test platform. On this platform, the CMA code and other erasure codes are used in simulation experiments, and the performance of the CMA code is analyzed and compared on the basis of the experimental data. The experimental results show that, while the CMA code meets the expected goals of high fault tolerance and MDS, its encoding efficiency is not inferior to that of array codes constructed from geometric forms, such as EVENODD codes, and its decoding efficiency differs little from that of EVENODD codes.
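As a rough illustration of two ideas in the abstract (the space cost of replication versus an MDS code, and check elements built as XOR chains selected by a binary matrix), the following minimal Python sketch may help. It is not the CMA construction: the function names and the toy 2x4 matrix are hypothetical, chosen only for illustration, and the toy matrix is not claimed to be MDS.

```python
from functools import reduce


def replication_efficiency(copies: int) -> float:
    """Fraction of raw capacity holding user data under n-way replication."""
    return 1 / copies


def mds_efficiency(k: int, m: int) -> float:
    """Fraction of raw capacity holding user data for an MDS code with
    k data blocks and m check blocks (tolerates any m lost blocks)."""
    return k / (k + m)


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data, matrix):
    """Each row of the binary matrix selects the data blocks that are
    XORed together to form one check block (one generation chain)."""
    return [
        xor_blocks([block for block, bit in zip(data, row) if bit])
        for row in matrix
    ]


def recover_one(data, parity, matrix, lost):
    """Rebuild data[lost] from the first chain that covers it, assuming
    every other block in that chain is intact."""
    for row, check in zip(matrix, parity):
        if row[lost]:
            others = [data[i] for i, bit in enumerate(row) if bit and i != lost]
            return xor_blocks([check] + others)
    raise ValueError("no chain covers the lost block")


# Toy example (hypothetical matrix, for illustration only).
data = [b"abcd", b"efgh", b"ijkl", b"mnop"]
matrix = [
    [1, 1, 1, 1],  # chain 0 covers every data block
    [1, 0, 1, 0],  # chain 1 covers blocks 0 and 2
]
parity = encode(data, matrix)

damaged = list(data)
damaged[2] = None  # simulate losing one data block
restored = recover_one(damaged, parity, matrix, lost=2)
```

Under these assumptions, three-way replication stores user data in one third of the raw capacity, while an MDS code with, say, 10 data blocks and 4 check blocks uses 10/14 of it. The number of ones in the matrix determines how many XOR operations encoding needs, which is why the choice of matrix affects the amount of computation.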
Keywords/Search Tags:Multiple fault tolerance, Maximum distance separable code, Array code, Distributed storage