
Non-volatile Memory Device Based Neural Network Accelerator Design

Posted on: 2021-02-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 1488306122479854
Subject: Control Science and Engineering
Abstract/Summary:
Recently, deep belief networks (DBNs) and convolutional neural networks (CNNs) have achieved significant success in many applications; in image recognition, for example, CNNs can even exceed human-level accuracy. This success comes from two sources: deeper network layers for handling complex tasks, and the availability of massive datasets for sufficient training. Although both contribute to accuracy, they also impose significant challenges in computation and storage, so efficient accelerators for ever-deeper neural networks (NNs) and large-scale input datasets are urgently needed. Among various hardware solutions, non-volatile memory (NVM) based computing architectures, one of the promising candidates for breaking the well-known memory-wall problem, are attractive for efficient NN acceleration because they process computation in memory: rather than moving data from memory to processing elements, they compute directly where the data are stored. The basic cells of NVM architectures include the memristor and the memcapacitor. Compared with conventional von Neumann processors, which separate storage from computation, NVM-based accelerators integrate computing into memory and thereby significantly reduce memory-access overhead.

However, existing NVM-based hardware architectures still face the following challenges:
1) Because the restricted Boltzmann machine (RBM, the basic building block of a DBN) performs frequent forward and backward propagations during training, prior NVM-based RBM accelerators spend substantial hardware resources handling intermediate results and suffer low energy efficiency.
2) Although CNNs offer significant input reuse along both the rows and the columns of input feature maps, existing NVM-based memory architectures cannot support both row- and column-oriented accesses, leading to inflexibility and high energy consumption.
3) Because reusable input activations overlap in CNNs, prior memristor-based CNN accelerators cannot fully exploit the reusable data, which results in significant energy consumption.
4) Because neuron circuits occupy substantial area and power, prior NN accelerators are difficult to scale to large sizes.

To address these four challenges, this dissertation develops efficient NVM-based accelerators for NN computation. The contributions are summarized as follows.

1. We propose a novel memristor-based RBM accelerator that computes a forward or backward propagation in one cycle to improve computational energy efficiency. The accelerator adopts two memristor crossbar arrays, which store the weight parameters (for RBM forward propagation) and the transposed weight parameters (for RBM backward propagation), respectively. Owing to the in-memory computing capability, the weight crossbar arrays perform the forward and backward multiply-and-accumulate (MAC) operations directly on the stored data, avoiding memory accesses. In addition, the accelerator uses memristor-based latches for temporary storage and operates the latches and the MAC crossbars in parallel, so that neuron-output storage and MAC computation proceed simultaneously. The accelerator can therefore complete an RBM forward or backward propagation at low power in one cycle, improving the energy efficiency of the contrastive-divergence (CD) training phase.

2. To enable both row- and column-oriented accesses, we propose a memristor-CMOS hybrid memory architecture for two-directional (2-D) accesses. Each storage cell consists of two memristors, which store the data, and two CMOS transistors, which enable the row- and column-oriented accesses. The proposed 2-D memory architecture can efficiently exploit reusable CNN activations to reduce memory-access overhead in large-scale CNN acceleration.

3. To handle the overlapped input activations in CNNs, we propose a novel memristor-based accelerator that reuses data for efficient acceleration. We introduce a novel dataflow that partitions both input
feature maps and kernel weights into pieces based on the kernel stride. The resulting input pieces are independent and can be reused regardless of the stride. The proposed accelerator deploys memristor crossbar arrays organized by the kernel pieces, reusing the input pieces to achieve lower energy overhead.

4. To address the scalability problem, we propose a memcapacitor-based NN accelerator that approximately retrieves specific functions by programming the memcapacitors with specific NN topologies. In the accelerator, the memcapacitors are attached to a neuron-MOS structure, which directly couples multiple memcapacitors to the floating gate of a MOS transistor to reduce chip-area overhead. Moreover, the accelerator consumes very little energy thanks to the capacitive-coupling effect.

In summary, the NVM-based architectures designed above can efficiently exploit the room for reducing computation cost in large-scale NN acceleration, and they have important theoretical significance for the design of the corresponding functional circuits.
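The one-cycle forward/backward propagation of contribution 1 can be illustrated numerically. The following is a minimal NumPy sketch, not the dissertation's circuit: each crossbar is modeled as a conductance matrix, so one analog MAC is a single matrix-vector product, and keeping both the weight matrix and its transpose in separate arrays gives one-step forward and backward passes.

```python
import numpy as np

# Illustrative model (not the actual analog circuit): a memristor crossbar
# storing conductances G computes output currents I = G @ V in one step
# (Ohm's law per cell, Kirchhoff's current law per output line).
rng = np.random.default_rng(0)
W = rng.uniform(0.1, 1.0, size=(4, 3))   # visible->hidden weights as conductances

# Two crossbars, as in the proposed RBM accelerator: one stores W, the
# other its transpose, so forward and backward propagation each take a
# single analog MAC cycle with no weight movement.
crossbar_fwd = W        # used for forward propagation
crossbar_bwd = W.T      # used for backward propagation

v = np.array([1.0, 0.0, 1.0])            # visible-layer input voltages
h = crossbar_fwd @ v                     # forward MAC: hidden pre-activations
v_recon = crossbar_bwd @ h               # backward MAC: visible reconstruction

print(h.shape, v_recon.shape)            # (4,) (3,)
```

Here the backward crossbar is written as `W.T` for clarity; in the design described above it is a second physical array programmed with the transposed weight values.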
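The stride-based partitioning of contribution 3 can be sketched in one dimension. The interleaved decomposition below (splitting the input and the kernel by index modulo the stride) is our reading of the dataflow, not the exact hardware mapping: each input piece is then consumed at unit stride, so it is fully reusable regardless of the original stride, and the partial results of the piece pairs sum to the strided convolution.

```python
import numpy as np

def strided_conv_direct(x, w, s):
    """Reference 1-D strided convolution (correlation form)."""
    n_out = (len(x) - len(w)) // s + 1
    return np.array([np.dot(w, x[i * s : i * s + len(w)]) for i in range(n_out)])

def strided_conv_partitioned(x, w, s):
    """Stride-based partitioning: split x and w into s interleaved pieces
    by index mod s; each (input piece, kernel piece) pair is convolved at
    unit stride, and the partial sums accumulate to the strided result."""
    n_out = (len(x) - len(w)) // s + 1
    y = np.zeros(n_out)
    for r in range(s):                    # one piece pair per residue class
        xr, wr = x[r::s], w[r::s]
        y += [np.dot(wr, xr[i : i + len(wr)]) for i in range(n_out)]
    return y

x = np.arange(12, dtype=float)            # toy input feature row
w = np.array([1.0, 2.0, 3.0, 4.0])        # toy kernel row
s = 2                                     # kernel stride
assert np.allclose(strided_conv_direct(x, w, s), strided_conv_partitioned(x, w, s))
```

In hardware, each kernel piece would map to its own crossbar array, so the corresponding input piece can be streamed in and reused without stride-induced overlap.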
Keywords/Search Tags:neural network, emerging non-volatile memory, memristor, memcapacitor, low power computation, hardware acceleration