
Design And Optimization Of Erasure Code Scheme For SSD-based Distributed Storage System

Posted on: 2019-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Peng
Full Text: PDF
GTID: 2428330563492480
Subject: Computer system architecture
Abstract/Summary:
With the development of cloud computing, big data, and the mobile Internet, the amount of data is increasing rapidly, and storage systems face increasingly daunting challenges. Traditional HDD-based distributed storage systems struggle to meet these storage requirements, so high-performance SSDs have been introduced to alleviate the situation. As SSD prices continue to decline, all-SSD storage systems have become feasible. However, the classic replication strategy causes a large number of writes to the SSDs, which affects the overall reliability of the system.

To reduce the SSD writes caused by the replication policy in a distributed storage environment, we introduce an erasure code protection strategy. We design a hybrid storage architecture in which small files are stored in replication mode and large files are stored directly in erasure code mode, to solve the wear problem caused by random writes of small files. By monitoring all the SSDs in the cluster, the partially accelerated wear strategy determines where to place the parity data. In the early period, wear is balanced across all SSDs. Once the remaining life falls to 20%, the parity is distributed over a fixed set of nodes, and these SSDs wear more quickly. We implement a parallel coding strategy, which greatly increases the coding speed. In addition, the data in the replication area is scanned regularly, and cold data is converted to the erasure code scheme. To limit the extra writes this conversion causes, we design a replica placement strategy for small files: the primary replicas of multiple small files are combined into new stripes, which reduces writes to the SSDs.

We use iozone to test the read and write performance of large files, and filebench to simulate access under various workloads, measuring the changes in the amount of data written to the SSDs and in the system bandwidth. The test results show that, under various loads, the system maintains write balance in the early stage and accelerates the wear of some nodes in the later stage, which meets the expectations of the partially accelerated wear strategy. The hybrid storage and the partially accelerated wear strategy increase the computational overhead, but the parallel coding strategy still brings an improvement of about 11.1% to 21.5% in write performance. Compared with replication mode, write performance in erasure code mode increases by 3.5% to 29.5%, which verifies the feasibility of online coding. The replica placement strategy for small files also reduces the write amplification caused by data transcoding, keeping it to only 5.6% to 14.8%.
Keywords/Search Tags: distributed storage, solid state disk, erasure codes, wear
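
The parity placement behaviour summarized in the abstract (balanced wear early on, then parity concentrated on a fixed set of nodes once remaining life reaches 20%) can be illustrated with a minimal sketch. This is not the thesis implementation. The names `Ssd`, `choose_parity_nodes`, and `fixed_set` are hypothetical, and the sketch makes three assumptions the abstract does not confirm: the 20% threshold is checked against the most worn SSD in the cluster, early-phase balancing is done by rotating parity across nodes stripe by stripe, and the fixed set defaults to the nodes with the lowest ids.

```python
from dataclasses import dataclass
from typing import List, Optional

# Remaining-life fraction at which the strategy switches phase (20% in the abstract).
WEAR_THRESHOLD = 0.20


@dataclass
class Ssd:
    node_id: int
    remaining_life: float  # 1.0 = new, 0.0 = worn out (hypothetical monitoring value)


def choose_parity_nodes(
    ssds: List[Ssd],
    stripe_index: int,
    m: int,
    fixed_set: Optional[List[int]] = None,
) -> List[int]:
    """Return the node ids that hold the m parity chunks of one stripe."""
    if m > len(ssds):
        raise ValueError("need at least m SSDs to place m parity chunks")

    ordered = sorted(ssds, key=lambda s: s.node_id)

    # Assumption: the late phase starts when the most worn SSD hits the threshold.
    late_phase = min(s.remaining_life for s in ordered) <= WEAR_THRESHOLD

    if late_phase:
        # Late phase: always use the same nodes so that they wear out first.
        # Assumption: default to the m lowest node ids if no set is supplied.
        if fixed_set is None:
            fixed_set = [s.node_id for s in ordered[:m]]
        return list(fixed_set)

    # Early phase: rotate parity across all nodes so parity writes spread evenly.
    n = len(ordered)
    return [ordered[(stripe_index + i) % n].node_id for i in range(m)]
```

With six SSDs at 90% remaining life and two parity chunks per stripe, `choose_parity_nodes(ssds, 5, 2)` returns `[5, 0]`, so parity moves around the cluster. After any SSD drops to 20% remaining life, every stripe gets `[0, 1]`, which concentrates parity writes on those two nodes.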