ECC Assisted Deduplication For Improving The Performance Of Flash-Based Storage Systems

Posted on: 2020-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: W D Zhu
GTID: 2428330572979100
Subject: Computer Science and Technology
Abstract/Summary:
Flash-based storage devices have been widely deployed in smartphones, desktops, and enterprise storage systems. However, the performance and reliability of flash-based storage systems are greatly restricted by write traffic because of write-read asymmetry and garbage collection (GC). Deduplication identifies and removes redundant data to improve the performance of storage systems, and has therefore received growing attention from both academia and industry. Although data deduplication significantly decreases the number of writes, the performance bottleneck of deduplication-based storage systems has long been the slow storage device. With the development of NVMe and 3D XPoint technologies, the performance of flash-based storage systems has improved significantly, which brings a nontrivial change to the performance landscape of flash-based deduplication systems: the bottleneck shifts from the I/O stack to the computation of cryptographic hash functions. Moreover, hash collisions are theoretically inevitable, because the hash functions used in deduplication storage systems map a longer data chunk to a shorter hash value. Traditional hash-based deduplication storage systems therefore have a reliability problem to some extent.

To address these problems, this thesis proposes an ECC-assisted deduplication scheme, EaD for short, which uses the ECC values of the data chunks within the flash-based storage device in place of traditional hash functions to perform similarity detection, thus establishing a collision-free and high-performance flash-based deduplication storage system. EaD significantly decreases CPU utilization because it eliminates hash computation. Based on the similarity detection results, EaD reads the similar data chunks from the flash-based storage device, which offers high read performance, and compares them byte by byte with incoming data chunks to identify and remove redundant chunks. Finally, EaD maintains a prefetch cache to reduce the negative impact of the additional read operations, further improving system performance. Performance evaluation on our lightweight prototype implementation of EaD shows that the hash-collision-free EaD significantly outperforms existing MD5/SHA-based and sampling-based deduplication schemes in terms of I/O performance by up to 4.2x, with an average of 2.5x.
Keywords/Search Tags: Flash Storage, Data Deduplication, ECC
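
The write path described in the abstract (ECC-based similarity detection, byte-by-byte confirmation, and a prefetch cache for the extra reads) can be illustrated with a minimal Python sketch. This is not the thesis prototype. Everything named here is an assumption made for illustration: the class `EaDSketch`, the function `stand_in_ecc` (CRC32 is used only as a placeholder for the ECC value that a real flash device would already hold for each chunk), the in-memory list standing in for flash, and the simple "also fetch the next chunk" prefetch policy.

```python
import zlib
from collections import OrderedDict


def stand_in_ecc(chunk: bytes) -> int:
    # ASSUMPTION: CRC32 is only a placeholder. In the scheme described above,
    # the ECC value already exists inside the flash device, so the host would
    # not compute anything here.
    return zlib.crc32(chunk)


class EaDSketch:
    """Toy model: ECC value as similarity key, byte comparison as the verdict."""

    def __init__(self, ecc=stand_in_ecc, cache_capacity=8):
        self.ecc = ecc
        self.flash = []             # simulated flash: address -> chunk bytes
        self.index = {}             # ECC value -> addresses of candidate chunks
        self.cache = OrderedDict()  # hypothetical LRU prefetch cache
        self.cache_capacity = cache_capacity
        self.flash_reads = 0
        self.flash_writes = 0

    def _cache_put(self, addr, data):
        self.cache[addr] = data
        self.cache.move_to_end(addr)
        while len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict least recently used entry

    def _read(self, addr):
        if addr in self.cache:      # cache hit: no flash read needed
            self.cache.move_to_end(addr)
            return self.cache[addr]
        self.flash_reads += 1
        data = self.flash[addr]
        self._cache_put(addr, data)
        nxt = addr + 1              # ASSUMED policy: prefetch the next chunk
        if nxt < len(self.flash) and nxt not in self.cache:
            self.flash_reads += 1
            self._cache_put(nxt, self.flash[nxt])
        return data

    def write(self, chunk: bytes) -> int:
        key = self.ecc(chunk)
        # Similarity detection: chunks with the same ECC value are candidates.
        for addr in self.index.get(key, []):
            # Byte-by-byte comparison decides, so equal keys never cause
            # two different chunks to be merged.
            if self._read(addr) == chunk:
                return addr         # duplicate: no new flash write
        addr = len(self.flash)
        self.flash.append(chunk)
        self.flash_writes += 1
        self.index.setdefault(key, []).append(addr)
        return addr


store = EaDSketch()
addrs = [store.write(c) for c in (b"alpha", b"beta", b"alpha")]
print(addrs, store.flash_writes)  # [0, 1, 0] 2
```

Because the final decision rests on the byte comparison rather than on the key, the sketch stays correct even with a deliberately poor key function such as `EaDSketch(ecc=lambda c: 0)`; a weak key only increases the number of candidate reads, which is the cost the prefetch cache is meant to soften.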