Dynamic Stripe Construction For Asynchronous Encoding In Clustered File System

Posted on: 2018-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Z Wei
Full Text: PDF
GTID: 2348330515996438
Subject: Computer system architecture
Abstract/Summary:
To combine the benefits of N-way replication and erasure coding, clustered file systems (CFSes) commonly adopt asynchronous encoding. With asynchronous encoding, a CFS first stores new files with N-way replication to ensure high read performance and then, once the files become cold, transforms them from N-way replication to erasure coding to keep space consumption low. Existing CFSes group data blocks into stripes according to their offsets within files. Because logically sequential data blocks of a file are replicated and distributed across different racks to ensure high reliability, encoding them into parity blocks requires downloading data blocks located in different racks, which induces heavy cross-rack traffic. This traffic degrades both the encoding speed and the response time to users' requests.

To accelerate asynchronous encoding and reduce its impact on foreground tasks, in this paper we propose an encoding scheme, Dynamic Stripe Construction (DSC), for transforming N-way replication into erasure coding. DSC constructs encoding stripes according to the current data block layout of the CFS: it selects a set of data blocks, each of which has one replica in a common rack and another replica in a separate rack of its own, to form a coding stripe. We carefully design a data structure that collects the metadata of all data blocks in a CFS and, based on it, propose an algorithm that efficiently groups data blocks into stripes to realize DSC. DSC is applicable to existing CFSes with various erasure codes and can be deployed on a distributed file system in a hot-pluggable manner. To verify the effectiveness of DSC, we implement it on HDFS. Extensive testbed experiments in a real storage cluster show that DSC significantly increases the encoding throughput (by up to 81% in our experiments) and reduces the foreground user response time compared with the traditional approach.

For system integration, we first discuss the data locality and load balancing problems in the encoding process, and then design inter-file encoding and iterative encoding to meet the demands of special environments. To adapt to the changing access frequency of data blocks, we combine dynamic replication with asynchronous encoding. This architecture allows the data blocks in the system to be managed dynamically, maintaining data reliability and access performance while minimizing the storage overhead of the system.
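
The stripe selection rule described in the abstract (every block in a stripe has a replica in one shared rack, plus another replica in a rack that no other block of the stripe uses) can be illustrated with a small greedy sketch. This is not the algorithm or data structure from the thesis, which the abstract does not specify; the function name `build_stripes`, the `block_racks` mapping, and the single-pass greedy strategy are all assumptions made for illustration.

```python
from collections import defaultdict


def build_stripes(block_racks, k):
    """Greedily group blocks into stripes of k data blocks (illustrative only).

    block_racks: hypothetical mapping of block id -> set of racks holding a replica.
    Returns a list of (core_rack, [block ids]) tuples. Every block in a stripe
    has a replica in core_rack and another replica in a rack that no other
    block of the same stripe has claimed.
    """
    # Index blocks by each rack that holds one of their replicas.
    by_rack = defaultdict(list)
    for block, racks in block_racks.items():
        for rack in racks:
            by_rack[rack].append(block)

    used = set()      # blocks already placed in a completed stripe
    stripes = []
    for core in sorted(by_rack):
        stripe, claimed = [], set()
        for block in by_rack[core]:
            if block in used:
                continue
            # Racks other than the core that no stripe member has claimed yet.
            free = sorted(r for r in block_racks[block]
                          if r != core and r not in claimed)
            if not free:
                continue  # no distinct rack available for this block
            stripe.append(block)
            claimed.add(free[0])
            if len(stripe) == k:
                stripes.append((core, stripe))
                used.update(stripe)
                stripe, claimed = [], set()
        # Blocks left in an incomplete stripe stay available for other racks.
    return stripes


layout = {
    "b1": {"r1", "r2"},
    "b2": {"r1", "r3"},
    "b3": {"r1", "r4"},
    "b4": {"r2", "r3"},
}
print(build_stripes(layout, 3))
```

With a stripe built this way, a node in the shared rack can presumably read all k data blocks without crossing racks, while the replicas in the distinct racks keep the data blocks spread out for reliability. A real implementation would also need to handle leftover blocks and parity placement, which this sketch ignores.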
Keywords/Search Tags:Clustered File System, Erasure Coding, Asynchronous Encoding, Dynamic Stripe Construction, Reliability