
CNN Accelerator Design And Optimization Based On Approximate Calculation And Data Scheduling

Posted on: 2019-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Zhu
Full Text: PDF
GTID: 2428330590475504
Subject: Integrated circuit engineering
Abstract/Summary:
Convolutional neural networks (CNNs) occupy an important position in deep learning: they are robust and easy to train and optimize. Because CNNs are storage-intensive and their network parameters are diverse, energy-efficient CNN accelerators are urgently needed in the low-power embedded field. Based on the structural characteristics of CNN algorithms, this thesis studies and optimizes approximate calculation and data scheduling to reduce the power consumption of the CNN accelerator under different application scenarios and requirements. First, by analyzing the redundancy and fault tolerance of CNNs, a hierarchical preprocessing scheme for the network parameters is used to reduce computational power consumption and the number of memory accesses. Then, based on the preprocessed weight data, approximate calculation at different hierarchical bit widths is implemented, which addresses the high power consumption of multiplication. In addition, the data-reuse trajectory of the convolution operation on the arithmetic array is studied, and a multi-output adder-tree routing interconnect structure is proposed. The tightly coupled storage of the array adopts a ping-pong buffer structure, so that convolution operations of different sizes can be mapped efficiently and flexibly. Next, the array supply voltage is optimized and adjusted based on the partitioning of large matrices, which resolves the data imbalance between the storage and the computing array and thereby realizes an energy-efficient data scheduling design for the CNN accelerator. Experimental results show that the accelerator designed in this thesis supports 4- to 16-bit computation for different CNN models in a TSMC 45 nm process. At 16 bit and 1.1 V, the power is only 142 mW and the energy efficiency reaches 0.681 TOPS/W; with 4-bit input data at 0.8 V, the power is only 27 mW and the energy efficiency reaches 3.59 TOPS/W. Compared with mainstream designs, this work achieves a 1.27 to 2.83 times improvement in energy efficiency.
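The two reported operating points can be cross-checked with simple arithmetic. The sketch below is not from the thesis: it assumes that energy efficiency (TOPS/W) is throughput divided by power at the same operating point, so the implied throughput is power times efficiency. The function and variable names are hypothetical, and the throughput figures are derived here, not stated in the abstract.

```python
# Operating points quoted in the abstract: name -> (power in W, efficiency in TOPS/W).
# The implied throughput is an assumption-based derivation, not a reported result.
POINTS = {
    "16-bit @ 1.1 V": (0.142, 0.681),
    "4-bit @ 0.8 V": (0.027, 3.59),
}


def implied_throughput_gops(power_w: float, tops_per_w: float) -> float:
    """Throughput in GOPS, assuming efficiency = throughput / power."""
    return power_w * tops_per_w * 1000.0


def summarize() -> dict:
    """Return the implied throughput in GOPS for each operating point."""
    return {
        name: implied_throughput_gops(power, eff)
        for name, (power, eff) in POINTS.items()
    }


def ratios() -> tuple:
    """Return (power ratio, efficiency ratio) between the two operating points."""
    p16, e16 = POINTS["16-bit @ 1.1 V"]
    p4, e4 = POINTS["4-bit @ 0.8 V"]
    return p16 / p4, e4 / e16


if __name__ == "__main__":
    for name, gops in summarize().items():
        print(f"{name}: about {gops:.1f} GOPS implied")
    power_ratio, eff_ratio = ratios()
    print(f"power ratio: {power_ratio:.2f}x, efficiency ratio: {eff_ratio:.2f}x")
```

Under this assumption, both operating points imply roughly the same throughput (about 97 GOPS), and the power and efficiency ratios are both about 5.3. This suggests that the efficiency gain at 4 bit and 0.8 V comes mainly from lower power, not from higher throughput, although the abstract does not say so explicitly.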
Keywords/Search Tags:Convolutional neural network, accelerator, quantization compression, approximate calculation, data scheduling