
Design And Consistency Evaluation Of Distributed Coded Data Storage

Posted on: 2020-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Zhu
Full Text: PDF
GTID: 2428330599951300
Subject: Computer Science and Technology
Abstract/Summary:
The nodes of Distributed Storage Systems (DSSs) are widely distributed and prone to failure. To prevent data loss caused by node failures, fault-tolerant coding must be applied in the system. In DSSs, erasure coding generally has higher storage efficiency than replication for the same fault tolerance. However, applying erasure coding to data storage in distributed network systems raises two technical challenges that urgently need to be addressed. On the one hand, traditional erasure coding must recover the whole original object when repairing a failed node, which consumes a tremendous amount of bandwidth. On the other hand, efficiently ensuring the consistency of shared storage objects in a concurrent read-write environment is a focus of DSS research. To address these two issues, this thesis carries out the following research: 1) it applies graph theory to the construction of Locally Repairable Codes (LRC), aiming to repair a single failed node locally and to reduce repair bandwidth; and 2) it proposes a Log-Based Atomic Consistency algorithm to ensure the consistency of operations on memory objects. The main research contents are as follows:

1) Because distributed storage systems are typically built from a large number of inexpensive disks, disk failures are inevitable and often lead to data loss. Data encoding is a necessary fault-tolerance mechanism to prevent that loss. Compared with classical Maximum Distance Separable (MDS) codes, LRC can improve the efficiency of data repair and reduce bandwidth usage at the cost of some additional storage overhead. This thesis uses extremal graph theory to reduce the storage overhead of LRC. Taking the storage nodes and coded blocks as the X-class and Y-class vertices of a bipartite graph, respectively, minimizing the storage space occupancy is equivalent to finding the minimum number of edges in this graph. This extremal problem can be reduced to the Zarankiewicz problem. The thesis models and analyzes LRC with extremal bipartite graphs and gives the corresponding construction algorithm.

2) With the surge of massive data, large-scale distributed storage networks are becoming increasingly popular. Erasure codes have less storage redundancy than replication schemes, so they have received wide attention and application in the storage field, but they also pose great challenges for concurrent and consistent operations in asynchronous DSSs. This thesis presents a Log-Based Atomic Consistency (LBAC) algorithm for Multi-Writer Multi-Reader (MWMR) environments. It uses the properties of Remote Direct Memory Access (RDMA) to operate on memory objects concurrently, and its feasibility and effectiveness are proved through theoretical analysis. LBAC uses MDS codes to provide the necessary resilience and combines them with a new log-based messaging mechanism to ensure the liveness and atomicity of operations. Performance evaluation shows that LBAC has good read and write performance and can achieve ideal load balancing and better storage efficiency.
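
The trade-off the abstract describes between replication, MDS codes, and LRC can be made concrete with a small calculation. The sketch below is not taken from the thesis: the function names and the parameter choices are hypothetical, and it relies only on standard facts, namely that an (n, k) MDS code tolerates any n - k erasures and classically reads k blocks to repair one, while a code with locality r reads r blocks and has minimum distance at most n - k - ceil(k/r) + 2.

```python
from math import ceil


def replication(copies: int) -> dict:
    """Keep `copies` full copies of every block."""
    return {"overhead": float(copies), "tolerates": copies - 1, "repair_reads": 1}


def mds(n: int, k: int) -> dict:
    """(n, k) MDS code: any k of the n blocks rebuild the data."""
    return {"overhead": n / k, "tolerates": n - k, "repair_reads": k}


def lrc(n: int, k: int, r: int) -> dict:
    """(n, k) code with locality r: one lost block is rebuilt from r others.

    The minimum distance d obeys d <= n - k - ceil(k / r) + 2, so the code
    can guarantee survival of at most d - 1 simultaneous failures.
    """
    d_max = n - k - ceil(k / r) + 2
    return {"overhead": n / k, "tolerates_at_most": d_max - 1, "repair_reads": r}


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    for name, scheme in [
        ("4-way replication", replication(4)),
        ("(15, 12) MDS", mds(15, 12)),
        ("(16, 12) LRC, r = 6", lrc(16, 12, 6)),
    ]:
        print(name, scheme)
```

With these illustrative parameters, all three schemes target up to three failures. Replication stores four times the data, the MDS code stores 1.25 times but reads 12 blocks per repair, and the LRC stores about 1.33 times while reading only 6 blocks, which is the "less repair bandwidth for some extra storage" exchange described above.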
Keywords/Search Tags: Extremal Graph, Zarankiewicz Problem, Erasure Coding, Atomicity, Consistency