The Research On Data Deduplication Technology Based On Open Channel SSD

Posted on: 2020-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Hu
GTID: 2428330590458327
Subject: Computer system architecture
Abstract/Summary:
NAND flash-based solid-state drives (SSDs) are widely deployed in storage systems because they outperform hard disks. However, SSDs suffer from write amplification and wear, so reducing write operations improves both their performance and their reliability. Statistics show that duplicate data is ubiquitous in common data sets, so data deduplication can reduce the amount of data written to an SSD and thereby improve its performance and reliability. Traditional SSD-based data deduplication, however, is computationally expensive and can become a performance bottleneck of the SSD. To address these problems, D-pblk, a data deduplication system based on Open-Channel SSD, is designed. D-pblk uses the host's high-performance processor to compute fingerprints quickly, which reduces the delay caused by deduplication. It adopts a dual fingerprint hashing strategy: CRC32 first computes a lightweight fingerprint to filter out non-duplicate data, and only when the data is likely to be a duplicate is SHA-1 used to compute a heavyweight fingerprint that determines whether the data is actually duplicated. This reduces fingerprint computing overhead. D-pblk also implements a double-ring buffer strategy, which solves the problem that the data remaining after deduplication may not match the size of a flash page. Experimental results show that D-pblk achieves a deduplication ratio of 4.61% to 31.63% across six common workloads. Compared with a system without deduplication, D-pblk reduces actual write traffic by up to 41% and improves write performance by up to 29%, without affecting read performance.
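The dual fingerprint hashing strategy described in the abstract can be illustrated with a minimal Python sketch. This is not D-pblk's implementation (which would run in the host-side Open-Channel SSD driver); the class name `DualFingerprintStore`, its in-memory tables, and the lazy SHA-1 computation for previously stored chunks are all assumptions made for illustration. It only shows the idea that a CRC32 miss proves a chunk is new, so SHA-1 is needed only when the CRC32 value has been seen before.

```python
import hashlib
import zlib


class DualFingerprintStore:
    """Hypothetical in-memory chunk store; names and layout are illustrative only."""

    def __init__(self):
        self.chunks = []      # stored unique chunks, indexed by chunk id
        self.by_crc = {}      # CRC32 value -> ids of stored chunks with that CRC32
        self.sha1_of = {}     # chunk id -> SHA-1 digest, filled in lazily
        self.sha1_calls = 0   # counts heavyweight fingerprint computations

    def _sha1(self, data):
        self.sha1_calls += 1
        return hashlib.sha1(data).digest()

    def _stored_sha1(self, chunk_id):
        # Compute the SHA-1 of an already stored chunk only when first needed.
        if chunk_id not in self.sha1_of:
            self.sha1_of[chunk_id] = self._sha1(self.chunks[chunk_id])
        return self.sha1_of[chunk_id]

    def _store(self, data, crc):
        chunk_id = len(self.chunks)
        self.chunks.append(data)
        self.by_crc.setdefault(crc, []).append(chunk_id)
        return chunk_id

    def write(self, data):
        """Return (chunk_id, is_duplicate) for one chunk of bytes."""
        crc = zlib.crc32(data)
        candidates = self.by_crc.get(crc)
        if candidates is None:
            # Lightweight filter: an unseen CRC32 means the chunk cannot be
            # a duplicate, so no SHA-1 is computed at all.
            return self._store(data, crc), False

        # CRC32 hit: the chunk is only *likely* a duplicate, so confirm
        # with the heavyweight SHA-1 fingerprint.
        digest = self._sha1(data)
        for chunk_id in list(candidates):
            if self._stored_sha1(chunk_id) == digest:
                return chunk_id, True

        # CRC32 collision between different chunks: store as new data.
        chunk_id = self._store(data, crc)
        self.sha1_of[chunk_id] = digest
        return chunk_id, False


if __name__ == "__main__":
    store = DualFingerprintStore()
    page = b"A" * 4096
    print(store.write(page))   # first write of this chunk
    print(store.write(page))   # same content written again
    print("SHA-1 computations:", store.sha1_calls)
```

In this sketch, unique data pays only the CRC32 cost, while SHA-1 is reserved for chunks whose CRC32 value has already been seen. A real system would also have to bound the size of the fingerprint tables and avoid rereading stored data, which this sketch ignores.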
Keywords/Search Tags: Open-Channel SSD, Data Deduplication, NAND Flash, Flash Translation Layer, Multi Thread