Research On Efficient Management And Read-Write Optimization Of 3D NAND Flash Memory

Posted on: 2020-12-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Z Feng
GTID: 1368330629482977
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of data, the demand for storage chips with higher performance, larger capacity, and lower cost continues to increase. Over the past ten years, flash memory has gradually become an alternative to the hard disk thanks to its excellent performance, shock resistance, and portability. Due to the physical limits on memory cell size, the sustainable development of flash memory has encountered a bottleneck. The three-dimensional stacked flash architecture provides a promising solution to the storage capacity problem. However, adopting a traditional Flash Translation Layer (FTL) to manage 3D flash memory causes performance and reliability issues. The increase in the number of pages per flash block and in page size leads to reduced storage space utilization, wasted data transfer time, and increased garbage collection overhead. Internal programming interference in 3D flash memory is more serious than in 2D flash memory, which results in more bit errors and degrades the reliability of the flash memory device. To address these issues, this thesis optimizes the FTL of Solid State Disks (SSDs) by exploiting 3D flash memory features to build high-performance, high-reliability 3D flash devices.

The size of a 3D flash page continues to increase, while the mapping granularity of existing FTLs is usually a flash page or a flash block, which introduces more redundant data transmission when processing small requests. This seriously affects the space utilization and performance of flash memory. To address this issue, we propose a novel multiple-subpage writing scheme based on flash page re-programming to reduce the response time of SSDs and the erase count of flash memory. The key idea is to enable finer-granularity flash space management and to exploit multiple subpage writes on a single page without an erase. Mapping Granularity Adaptive FTL (MGA-FTL) is proposed for SLC flash memory based on the flash page re-programming feature. Different subpages within a single flash page are used to respond to multiple small requests. A 2-Level Mapping is introduced to serve requests of different sizes in order to control the DRAM overhead of the subpage mapping table. Meanwhile, the allocation strategy determines whether different logical pages can be mapped to a single physical page, balancing space utilization against performance. Multiple SubPage Writing FTL (MSPW-FTL) is proposed for MLC flash memory by exploiting the SLC/MLC dual mode and the flash page re-programming feature. By converting MLC-mode blocks to SLC-mode blocks, we use SLC-mode blocks to serve write requests smaller than a flash page. To manage the dual-mode metadata, an SLC-mode operation handler is also designed to prevent data corruption. In addition, we design a Read-Intensive-Aware update scheme and a subpage merge scheme to deal with the data fragmentation caused by subpage-granularity allocation. Experimental results show that, compared with page mapping, MGA-FTL reduces the I/O response time, write amplification, and number of erasures by 53%, 30%, and 40%, respectively. MSPW-FTL reduces the I/O response time by 57.2%, the write amplification by 52.1%, and the number of erasures by 34.1% on average.

With the high-density advantage, fewer 3D NAND chips are needed to build higher-capacity embedded storage devices. However, this decrease in the number of chips means fewer parallel units, which reduces channel bandwidth utilization. To address this issue, we propose a Maximize Bandwidth Management (MBM) FTL based on the read and write asymmetry of flash memory to reduce the response time and tail latency of SSDs and improve the I/O bandwidth. By analyzing the relationship between the execution timing of multi-level parallelism and channel bandwidth utilization, we propose a parallelism-enhanced Write Strategy (WS) and a parallelism-relaxed Read Strategy (RS). WS introduces an extra active block for garbage collection in each plane to enhance intra-chip parallelism. Additionally, WS maximizes write parallelism through load balancing among multiple channels and global allocation scheduling. RS focuses on ensuring that the previous request is completed first, limiting the mixed execution of different read requests in order to reduce the response time. Experimental results show that MBM reduces the average response time by up to 43.6% and raises the I/O bandwidth to 2.1x compared with a twin block management scheme. In particular, MBM-FTL significantly reduces the tail latency between the 99th and 99.999th percentiles.

Due to the additional interference from flash cells in adjacent layers, the programming interference of 3D flash becomes more serious, resulting in an increased bit error rate. To solve the interference problem and improve reliability, we propose a read-write optimization scheme based on Disturbance Compensation and Data Clustering (DC²) to reduce the bit error rate and improve device performance. We describe the physical page layout characteristics of 3D flash memory and quantitatively analyze the amount of interference that a flash cell receives from cells being programmed in the same layer and in the adjacent layer, and then propose the Disturbance Compensation Programming Scheme. The flash pages near the most recently written page in the active blocks are called margin pages, and they are more likely to contain bit errors. In 3D flash memory, the number of margin pages is greatly increased. To ensure the reliability of the margin pages, we propose the Read reference Voltage Shifting (RVS) and Artificial Compensation (AC) strategies. RVS sets different read voltages to obtain a better read window depending on the physical location of the error page. AC is used when a read error occurs on a margin page, compensating for the threshold voltage by writing random data near the error page. An in-depth analysis of workload access patterns shows that most logical pages are updated only a small number of times, and only a few logical pages are updated more than 4 times during runtime. Multiple-Level-Queue page allocation is proposed to filter logical addresses by update count and to choose different data blocks to serve them. Experimental results show that DC² reduces the disturbed bit error rate by at least 82% compared with a traditional page-mapping FTL scheme. In addition, we demonstrate that DC² achieves better results than the state-of-the-art scheme when the number of pages in a block increases significantly. The write amplification, I/O response time, and number of erasures are reduced by 31.2%, 22.2%, and 14.6% on average, respectively.
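
The Multiple-Level-Queue page allocation described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how many queues are used or where the cut-offs lie, apart from noting that few pages are updated more than 4 times. The class name, the `(0, 4)` thresholds, and the three block groups below are therefore assumptions made for illustration, not the dissertation's implementation.

```python
class MultiLevelQueueAllocator:
    """Hypothetical sketch: route logical pages to block groups by update count."""

    def __init__(self, thresholds=(0, 4)):
        # Assumed cut-offs on the update count; only "more than 4 updates"
        # is mentioned in the abstract, the rest is illustrative.
        self.thresholds = thresholds
        # lpn -> number of updates seen after the first write
        self.updates = {}
        # One block group per level: rarely, moderately, frequently updated
        self.block_groups = [[] for _ in range(len(thresholds) + 1)]

    def level_for(self, update_count):
        # Return the first level whose threshold is not exceeded.
        for level, limit in enumerate(self.thresholds):
            if update_count <= limit:
                return level
        return len(self.thresholds)

    def write(self, lpn):
        # The first write of a logical page counts as zero updates.
        if lpn in self.updates:
            self.updates[lpn] += 1
        else:
            self.updates[lpn] = 0
        level = self.level_for(self.updates[lpn])
        # Record which block group serves this write.
        self.block_groups[level].append(lpn)
        return level


alloc = MultiLevelQueueAllocator()
levels = [alloc.write(7) for _ in range(6)]
print(levels)  # [0, 1, 1, 1, 1, 2]
```

Under these assumed thresholds, a page moves to the hottest block group only on its fifth update, so data with similar update frequency ends up clustered in the same blocks. That clustering is the general idea the abstract attributes to the scheme.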
Keywords/Search Tags: 3D NAND Flash, Flash Translation Layer, Parallelism, Program Interference