Quantization and Deployment of Convolutional Neural Networks Based on NOR Flash

Posted on: 2022-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Chen
GTID: 2518306323465334
Subject: Electronic Science and Technology
Abstract/Summary:
In recent years, convolutional neural networks (CNNs) have been widely used in image processing. As network scale and the number of model parameters keep growing, a single CNN computation cycle may involve millions of parameter reads and arithmetic operations. Because storage and computation are separated, traditional von Neumann hardware runs into the memory wall, which makes it a poor carrier for edge computing (computation near the device side). Computing-in-memory (CIM) devices can break through the memory wall by combining the compute unit and the storage unit. Among these CIM devices, the floating-gate device (Flash), with its mature manufacturing technology, is well suited as the medium for building computing-in-memory hardware and deploying CNN models on edge devices. Given the low-precision nature of edge computing, this thesis studies a low-bit-width convolutional neural network adapted to edge computing.

The main work includes:

1. Based on the principle by which a NOR Flash array performs analog multiplication, combined with the storage characteristics of the floating-gate cell, a method for realizing a full 4-bit convolutional neural network model is studied. The method uses different quantization schemes for the parameters and the activations of the network. For parameters, the thresholds of the floating-point parameters and the scaling factors are optimized continuously during training so that the quantization mapping becomes more accurate. For activations, additional parameters are introduced into the ReLU activation function so that the activation quantization can also adapt to the actual situation.

2. Using the CIFAR-10 classification data set, three classic CNNs were quantized to verify the practical effect of the dynamic threshold quantization algorithm. Further experiments explore how per-channel and per-layer quantization of the weights affect the accuracy loss of the CNN. The experimental results show that, compared with the 32-bit floating-point model, the accuracy loss of each of the three CNN models after full 4-bit quantization with the dynamic threshold method is less than 1.5%. This preserves the accuracy of the CNN model in low-precision computation and provides a prerequisite for deploying it on edge devices.

3. The deployment of the quantized model on a NOR Flash array is studied. A small 4-bit quantized neural network is trained on the MNIST data set, and the convolutional layer operation is implemented with an analog multiply-accumulate unit composed of a NOR Flash array. In a standard PVT simulation environment, the BSIM4 model of the XMC FG 65 nm process is used to carry out mixed digital-analog simulation of the circuit with HSIM, while the other layers of the network are realized in software. The final statistics show that the network reaches an accuracy of 96.12%, less than 2% below the software-side inference result. This verifies the feasibility of deploying a small CNN model on a NOR Flash array.
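
The abstract does not give the quantization formulas, so the following Python sketch is only one plausible reading of the ideas it names: symmetric 4-bit weight quantization clipped at a threshold, a ReLU clipped at an upper bound before 4-bit rounding, and a comparison of per-layer against per-channel weight thresholds. All function names, the choice of the maximum absolute value as threshold, and the sample weights are assumptions for illustration. In the thesis the thresholds are optimized during training, whereas here they are fixed inputs.

```python
def quantize_weights(weights, threshold, bits=4):
    """Symmetric uniform quantization (assumed scheme): clip to
    [-threshold, threshold] and round to signed integer codes."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4 bits
    scale = threshold / qmax
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def quantize_activation(x, alpha, bits=4):
    """Clipped ReLU (assumed scheme): clamp x to [0, alpha], then round
    to an unsigned integer code. alpha stands in for the extra parameter
    added to ReLU; here it is a fixed input, not a trained value."""
    levels = 2 ** bits - 1                  # 15 for 4 bits
    scale = alpha / levels
    clipped = max(0.0, min(alpha, x))
    return round(clipped / scale), scale


def dequantize(codes, scale):
    """Map integer codes back to floating-point values."""
    return [c * scale for c in codes]


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def compare_granularity(channels, bits=4):
    """Return (per_layer_error, per_channel_error) for a list of channels,
    using the maximum absolute weight as the threshold in both cases."""
    flat = [w for ch in channels for w in ch]

    # Per-layer: one threshold shared by every channel.
    layer_threshold = max(abs(w) for w in flat)
    codes, scale = quantize_weights(flat, layer_threshold, bits)
    per_layer_error = mse(flat, dequantize(codes, scale))

    # Per-channel: each channel gets its own threshold and scale.
    restored = []
    for ch in channels:
        ch_threshold = max(abs(w) for w in ch)
        codes, scale = quantize_weights(ch, ch_threshold, bits)
        restored.extend(dequantize(codes, scale))
    per_channel_error = mse(flat, restored)

    return per_layer_error, per_channel_error


# Hypothetical weights: one channel with large values, one with small values.
sample_channels = [[0.9, -0.5, 0.3], [0.05, -0.02, 0.04]]
layer_err, channel_err = compare_granularity(sample_channels)
print("per-layer MSE:", layer_err)
print("per-channel MSE:", channel_err)
```

With the sample weights above, the shared per-layer threshold rounds every weight of the small-valued channel to zero, while per-channel thresholds keep them distinguishable. This illustrates why quantization granularity can matter for accuracy loss; it does not reproduce the thesis's experimental results.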
Keywords/Search Tags: Compute in Memory, Quantized Neural Network, NOR Flash