Research Of Optimization With High Performance Multi-streamed SSDs

Posted on: 2022-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhou
Full Text: PDF
GTID: 2518306482989549
Subject: Computer technology
Abstract/Summary:
Solid state drives (SSDs), which are built from NAND flash memory, have been widely adopted in data centers, web servers, and cloud computing because of their high performance and low energy consumption. Unlike traditional storage devices, SSDs have several limitations. NAND flash memory cannot update data in place. Instead, updates are performed out of place: the old data is invalidated and the new data is written to free space. Once the free space is consumed, a time-consuming process called garbage collection (GC) is triggered to reclaim the space held by invalid data, and GC degrades both the performance and the lifetime of the SSD. Multi-streamed SSDs have been proposed and widely developed to optimize the GC process. The basic idea is to identify data with a similar lifetime, which we call a stream, and write it to the same group of flash blocks. The host passes a stream ID along with each write request to the multi-streamed SSD, which conveys a hint about the hotness of the data. With this scheme, the overhead of GC is significantly reduced.

One critical issue in the design of multi-streamed SSDs is identifying data with similar hotness, which is called stream identification. However, no existing scheme works well when data is updated out of place. Moreover, no previous work has discussed how to use the stream information efficiently inside the SSD controller, especially with respect to the write cache. When the write cache design is not aware of streams, problems can arise both among streams and within a stream. First, data from different streams has different hotness, so the streams may conflict with each other inside the cache. Second, because identifying streams is difficult, the identification may be inaccurate. This paper focuses on the optimization of high-performance multi-streamed SSDs, including the design of a stream identification scheme for data that is updated out of place and a study of the interaction between the write cache and multi-stream technology.

We summarize our contributions as follows. 1) A stream identification scheme based on the append-only feature of F2FS is proposed, which assigns a different stream ID to each log area of F2FS. This scheme solves the problem that the logical space layout of data is not consistent with the physical space layout. 2) A stream-based write cache partitioning scheme is proposed that manages data from different streams separately, which solves the inter-stream conflict problem. This scheme optimizes the design of the write cache using multi-stream information. 3) An intra-stream active cache eviction scheme is proposed that actively evicts data to blocks with as many invalid pages as possible, which addresses the intra-stream inaccuracy problem. This scheme refines the stream identification results through the design of the write cache.

We verify the effectiveness of the proposed schemes through experiments on a real hardware platform and a simulator. Experimental results show that, when there are many small files with frequent fine-grained random updates, the proposed stream identification scheme reduces the amount of extra writes to the SSD by a factor of 800, lowers the number of erase operations by a factor of 4, and brings the write amplification factor (WAF) close to 1. The proposed cache management schemes further reduce the write amplification of SSDs at negligible cost. In general, the schemes proposed in this paper have important reference significance for the optimization of high-performance multi-streamed SSDs.
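The first two contributions can be pictured with a small Python sketch. It is illustrative only: the abstract does not give the actual stream ID mapping or cache policy, so the function `stream_id_for`, the class `PartitionedWriteCache`, the per-stream LRU policy, the fixed partition size, and the choice to leave ID 0 for untagged writes are all assumptions made for this example. The six log names follow the hot/warm/cold node and data logs that F2FS maintains. The active eviction scheme of contribution 3 is not modeled here.

```python
from collections import OrderedDict
from enum import Enum


class F2fsLog(Enum):
    """The six active logs of F2FS: hot/warm/cold for node and for data."""
    HOT_NODE = 0
    WARM_NODE = 1
    COLD_NODE = 2
    HOT_DATA = 3
    WARM_DATA = 4
    COLD_DATA = 5


def stream_id_for(log: F2fsLog) -> int:
    """Hypothetical mapping: one stream per log area.

    ID 0 is left for untagged writes (an assumption of this sketch).
    """
    return log.value + 1


class PartitionedWriteCache:
    """Toy write cache that keeps a separate LRU partition per stream."""

    def __init__(self, pages_per_stream: int):
        self.pages_per_stream = pages_per_stream
        self.partitions = {}  # stream_id -> OrderedDict(lpn -> data)

    def write(self, stream_id: int, lpn: int, data: bytes) -> list:
        """Cache one page and return the (lpn, data) pairs evicted.

        Evictions only ever come from the partition of the written stream,
        so traffic in one stream cannot push out pages of another.
        """
        part = self.partitions.setdefault(stream_id, OrderedDict())
        if lpn in part:
            part.move_to_end(lpn)  # refresh recency on overwrite
        part[lpn] = data
        evicted = []
        while len(part) > self.pages_per_stream:
            evicted.append(part.popitem(last=False))  # drop least recent
        return evicted


def waf(host_pages: int, gc_pages: int) -> float:
    """Write amplification factor: flash pages written per host page."""
    return (host_pages + gc_pages) / host_pages
```

In this sketch, a burst of hot-data writes can only evict other hot-data pages, which is the inter-stream isolation that the partitioning scheme aims for. With no GC copy traffic, `waf` returns exactly 1, which is the value the reported results approach.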
Keywords/Search Tags: SSD, Multi-Stream, file system, cache