| With the advent of cloud computing and big data, explosive data growth poses severe challenges to the capacity, performance, and reliability of storage systems. Three-dimensional (3D) NAND (Not-And) flash memory-based solid-state drives (SSDs) have gradually come to dominate storage devices, replacing traditional hard disk drives, owing to their high capacity, high performance, and low power consumption. The storage capacity of NAND flash memory has grown exponentially, benefiting from technologies including 3D stacking, multi-level storage cells, and space shrinking. However, these technologies create a series of problems for storage devices and systems, since the capacity growth sacrifices the performance and reliability of the storage medium. Solving these problems requires an in-depth and comprehensive understanding of the characteristics of NAND flash memory. Unfortunately, these characteristics are becoming more complex and perplexing as different flash technologies develop along divergent paths, leading to a widening gap between the understanding of storage media characteristics and the requirements of system optimization. Therefore, it is imperative to characterize and model 3D NAND flash memory in depth. Existing research lacks a multi-level characterization of the performance, reliability, and threshold voltage (Vth) distribution of 3D NAND flash memory, and does not yet make full use of these characteristics to optimize system performance and reliability. To address this, this thesis conducts comprehensive performance and reliability experiments by testing seven chip models covering several manufacturers, structures, and types under different interference combinations of program/erase (P/E) cycles, data retention time, and read disturbance. Multi-level and systematic characterizations of the correlation between the characteristics and the interference factors are made. This thesis characterizes 3D NAND flash memory and analyzes the sensitivity
variation and Vth shift laws at multiple levels, both qualitatively and quantitatively. Some feasible suggestions and low-benefit analyses for optimizing the performance and reliability of flash storage systems are given for multiple application scenarios. The characterizations and analyses provide theoretical and data support for the characteristics modeling and application of 3D NAND flash memory. The variation in characteristics among flash models from multiple manufacturers, types, and stacked architectures challenges the universality of existing modeling methods. The multiple interference factors and multi-level variations in 3D NAND flash memory also seriously affect the accuracy of existing modeling methods. To bridge the gap between modeling universality and accuracy, this thesis introduces machine learning algorithms into the characteristics modeling of 3D NAND flash memory, fully considering the synergy of multiple interference factors and variations. A universal Vth distribution modeling method and a raw bit error rate (RBER) modeling method are proposed. Compared with state-of-the-art models, the evaluation results show that the neural network-based Vth distribution model achieves 4.91× the accuracy and 2.19× the accuracy-overhead ratio, and the LightGBM-based RBER model achieves up to 108× the prediction stability and 8.67× the accuracy without introducing extra overhead. The rapid capacity growth of 3D NAND flash memory also causes test costs to increase dozens of times over, and the low efficiency of characteristic data acquisition in high-density, large-capacity 3D NAND flash memory is becoming a serious problem. In view of this, an error generation method is proposed to efficiently construct a large-capacity characteristics data sample set by learning from a small sample set. This method uses conditional generative adversarial networks (cGAN) to learn the error features under combinations of multiple interference factors and generates error feature data. The experimental results show that the average relative deviation
between the generated and measured layer error distributions can be as low as 2.88%. The relative deviation between the generated and measured block-total error distributions can be as low as 2.9%. While preserving authenticity and diversity, the cGAN model running in CPU mode can generate 2400 data samples per second. The generation overhead is negligible compared with testing a sample dataset of the same scale. |
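
The abstract does not spell out how the average relative deviation between generated and measured error distributions is computed. As a rough sketch, one plausible reading is the mean over layers of |generated - measured| / measured. This definition, the function name `average_relative_deviation`, and the sample per-layer error counts below are all assumptions made for illustration; they are not taken from the thesis.

```python
def average_relative_deviation(generated, measured):
    """Mean of |g - m| / m over paired entries (assumed definition, not from the thesis)."""
    if len(generated) != len(measured):
        raise ValueError("generated and measured must have the same length")
    if not measured:
        raise ValueError("inputs must not be empty")
    deviations = []
    for g, m in zip(generated, measured):
        if m == 0:
            # Relative deviation is undefined when the measured value is zero.
            raise ValueError("measured values must be non-zero")
        deviations.append(abs(g - m) / m)
    return sum(deviations) / len(deviations)


# Hypothetical per-layer error counts, invented purely for illustration.
measured_layer_errors = [100, 200, 400]
generated_layer_errors = [110, 190, 400]

ratio = average_relative_deviation(generated_layer_errors, measured_layer_errors)
print(f"average relative deviation: {ratio:.2%}")
```

Under this reading, a lower value means the generated layer error distribution tracks the measured one more closely. The same function could be applied to block-total error counts.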