
Research On Correlation-Aware Compression Technology For Flash-based Storage System

Posted on: 2022-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: J D Zhou
Full Text: PDF
GTID: 2568306323971449
Subject: Software engineering
Abstract/Summary:
Demand for storage performance keeps rising, and flash-based storage drives are now widely used in mobile devices such as cell phones, smart watches, and tablets. However, SSDs remain far more expensive than traditional disks, so improving the space utilization of flash memory is a critical concern. Data compression is the most intuitive data reduction technique and is widely used in various storage systems. However, flash stores data in pages that cannot be modified in place once written: any change to the data becomes a new write request. As a result, traditional compressed storage systems suffer from read amplification and write amplification, and traditional compression solutions cannot fully exploit the performance advantage of SSDs.

In this thesis, we first demonstrate via real-world trace analysis that correlated chunks are prevalent and of great significance to practical accesses in storage systems. Exploiting data correlation in the compression system not only solves the read/write amplification problem but also improves the read/write performance of SSDs. This thesis therefore presents CoCo, a correlation-aware compression approach. When handling write requests, CoCo compresses highly correlated data and stores it in the same flash page, which improves write performance and avoids the problem of partially invalid data. When handling read requests, CoCo uses the stored correlation information to make caching and prefetching more efficient, thereby improving the read performance of the system.

To reduce the computational overhead of CoCo, this thesis first designs a lightweight yet effective data structure to capture correlated chunks. Compared with existing schemes, CoCo reduces memory overhead by 99.8% and time overhead by 97.8%, with a recognition error of less than 3%. The thesis then designs the compression engine of CoCo and further optimizes the read and write performance of the system by combining the correlation analysis module with the compression engine. Finally, the thesis conducts extensive experiments with real-world traces in SSDsim. Compared with existing solutions and with no compression, CoCo reduces read latency by 13.7%-50.7% and write latency by 13.2%-25.2% on average. At the same time, CoCo reduces garbage collection operations by 39.8% and write traffic by 23.7%.
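The write path described above (find correlated chunks, then compress them together into one flash page) can be illustrated with a minimal sketch. This is not CoCo's actual data structure or compression engine, which the abstract does not specify. The function names, the fixed-window co-access counter, the union-find grouping, the use of zlib, and the 4 KiB page size are all assumptions made for illustration.

```python
import zlib
from collections import Counter
from itertools import combinations

PAGE_SIZE = 4096  # assumed flash page size in bytes (hypothetical)


def count_co_accesses(trace, window=2):
    """Count how often two chunk IDs fall in the same non-overlapping window.

    A simple stand-in for a correlation detector; the thesis uses its own
    lightweight structure, which is not reproduced here.
    """
    counts = Counter()
    for start in range(0, len(trace), window):
        ids = sorted(set(trace[start:start + window]))
        for pair in combinations(ids, 2):
            counts[pair] += 1
    return counts


def correlated_groups(counts, threshold):
    """Merge chunks whose co-access count reaches the threshold (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in counts.items():
        if n >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), []).append(x)
    return [sorted(g) for g in groups.values()]


def pack_group(chunks, group):
    """Compress a group's chunks together; return None if it exceeds one page."""
    payload = b"".join(chunks[c] for c in group)
    compressed = zlib.compress(payload)
    return compressed if len(compressed) <= PAGE_SIZE else None


# Usage: chunks 1 and 2 are accessed together repeatedly, so they form a group
# and are compressed into a single page payload.
trace = [1, 2, 1, 2, 3, 4, 1, 2]
chunks = {i: bytes([i]) * 4096 for i in (1, 2, 3, 4)}
groups = correlated_groups(count_co_accesses(trace), threshold=2)
pages = [pack_group(chunks, g) for g in groups]
```

Because the chunks in a group tend to be read and updated together, co-locating them in one page means an update invalidates the page as a whole rather than leaving it partially valid, which is the effect the abstract attributes to correlation-aware placement.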
Keywords/Search Tags: Data Compression, Data Correlation, Flash-based Storage