Design And Implementation Of Layered Erasure Code Algorithm For Data Deduplication System

Posted on: 2021-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Liu
GTID: 2518306104487844
Subject: Computer system architecture
Abstract/Summary:
Data deduplication is widely used in storage systems to reduce storage overhead, but it also introduces reliability problems. Erasure coding, a data redundancy technique with low storage overhead and high fault tolerance, has therefore been introduced into deduplication systems to improve their reliability. However, when a data node fails in a deduplication system that uses erasure coding, degraded read performance drops as the number of references to a data block increases, which in turn lowers the overall read performance of the system. To improve the degraded read performance of erasure-code-based deduplication systems, this thesis carries out the following work.

First, a layered erasure coding algorithm for data deduplication systems is designed and implemented in a prototype system, Level EC. The algorithm places data blocks into layers according to their reference counts and encodes the blocks of different layers with erasure code stripes of different lengths. This reduces the amount of data transferred in a single degraded read and improves the system's degraded read performance. Second, to address the problem that degraded reads under inter-container coding perform worse than those under intra-container coding, a dedicated optimization strategy is designed that adds a limited cache to reduce the amount of data transferred during inter-container degraded reads. Third, to address the storage space wasted by the layering scheme, a space reuse optimization is designed: filling the data holes left when data blocks move between layers reduces the additional storage overhead of the layering strategy.

The prototype system was tested on three data sets. The results show that the layered scheme improves average read performance by 174.5% and degraded read performance by 287.4%, at the cost of up to 13.8% lower write performance and 5.1% more storage overhead. The time of a single degraded read is 12.8% lower with the layered scheme than with the unlayered scheme. Compared with the degraded reads of existing erasure-code-based deduplication systems, Level EC is 200.8%, 407.2%, and 530.1% more efficient than R-ADMAD when one, two, and three nodes fail, respectively, and 63.9%, 407.2%, and 530.1% more efficient than EEC-Dedup when one, two, and three nodes fail, respectively.
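The layering idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the layer thresholds, the stripe widths, and the names `LAYERS`, `Block`, `stripe_width`, `build_stripes`, and `degraded_read_cost` are all assumptions made for illustration. The sketch only shows why a shorter stripe for frequently referenced blocks lowers the data transferred per degraded read, under the usual assumption that rebuilding one lost block of a stripe with k data blocks requires fetching k surviving blocks.

```python
from dataclasses import dataclass

# Hypothetical layer table: (minimum reference count, data blocks per stripe k).
# These values are assumed for illustration and do not come from the thesis.
# Layers with more references get shorter stripes, so rebuilding one lost
# block requires fetching fewer surviving blocks.
LAYERS = [(16, 2), (4, 4), (1, 8)]


@dataclass
class Block:
    fingerprint: str
    ref_count: int


def stripe_width(ref_count: int) -> int:
    """Return the assumed stripe width k for a block with this reference count."""
    for min_refs, k in LAYERS:
        if ref_count >= min_refs:
            return k
    raise ValueError("ref_count must be at least 1")


def build_stripes(blocks):
    """Group blocks by layer, then cut each layer into stripes of at most k blocks."""
    by_width = {}
    for block in blocks:
        by_width.setdefault(stripe_width(block.ref_count), []).append(block)
    stripes = []
    for k, members in sorted(by_width.items()):
        for i in range(0, len(members), k):
            # The last stripe of a layer may be short; a real system would pad it.
            stripes.append((k, members[i:i + k]))
    return stripes


def degraded_read_cost(block: Block) -> int:
    """Blocks fetched to rebuild one lost block: k survivors from its stripe."""
    return stripe_width(block.ref_count)
```

With these assumed values, a block referenced 20 times sits in a stripe of width 2, so a degraded read fetches 2 blocks, while a block referenced once sits in a stripe of width 8 and costs 8 fetches. Parity computation and container placement are deliberately left out of the sketch.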
Keywords/Search Tags: Data deduplication, Erasure coding, Degraded read, Data block reference times, Layered coding