
Convolutional Neural Network Model Compression And Inference Acceleration Based On Look Up Table

Posted on: 2021-05-04    Degree: Master    Type: Thesis
Country: China    Candidate: S Y Xu    Full Text: PDF
GTID: 2428330611499322    Subject: Electronic and communication engineering
Abstract/Summary:
Convolutional neural networks (CNNs) have been widely applied to computer vision tasks and have achieved dramatic accuracy improvements. However, the massive parameter counts and heavy computation requirements of CNNs limit their deployment on mobile terminals, which lack computing power. Parameter quantization to lower bit-widths is a common approach to reducing the computation load of CNN inference. With the parameters replaced by fixed-width binary codes, multiplication operations can be replaced by lookup table (LUT) accesses, where the multiplier-multiplicand operand pair serves as the table index and the pre-calculated products serve as the table elements. Because the histogram profiles of the parameters in different layers/channels of a CNN differ significantly, previous LUT-based computation methods have to use a different LUT for each layer/channel, and consequently demand larger memory space along with extra access time and power consumption.

In this work, we first normalize the parameters' Gaussian profiles of different layers/channels to have similar means and variances, and then quantize the normalized parameters into fixed-width codes through iterative clustering. Because of the normalized parameter profile, only a single compact LUT (16 × 16 entries) is needed to replace all multiplications in the whole network. Experiments on image classification tasks demonstrate that, with a compact 256-entry LUT, we can achieve accuracy comparable to 32-bit floating-point computation while significantly reducing computation load and memory usage. Compared to previous work using LUT-based convolution, both the size and the number of LUTs required for a CNN are significantly reduced.

To verify the effectiveness of the algorithm at the hardware level, this work implements a CNN inference system based on a single lookup table, using an FPGA as the target hardware platform. Based on the characteristics of lookup-table multiplication, a synchronous dataflow computational architecture for the LUT-based CNN is designed. A set of optimizations, e.g. memory partitioning and stream rearrangement, which enables efficient mapping of the LUT-based network to hardware, is proposed. Basic CNN modules, including LUT-based convolution, pooling, and fully connected layers, are implemented in C++. Experiments show that the LUT-based CNN on the PYNQ-Z2 FPGA platform is superior to a fixed-point implementation in resource usage, latency, and throughput: it saves 56.1% of BRAM, 52.1% of DSP utilization, and 21% of power consumption compared with the fixed-point implementation, and achieves nearly 4.5 GOP/s of computing throughput on the PYNQ-Z2, which is 59× faster than an ARM Cortex-A9 processor.
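To make the single-LUT idea concrete, the C++ sketch below shows how a 16 × 16 table of pre-computed products can replace every multiplication in a dot product over 4-bit weight and activation codes. This is a minimal illustration of the scheme described in the abstract, not the thesis implementation; the names (weight_centroids, act_centroids, product_lut, lut_dot) and the use of float-valued products are assumptions.

    // Minimal sketch of LUT-based multiplication with 4-bit codes (illustrative names).
    #include <array>
    #include <cstdint>
    #include <vector>

    // Reconstruction values for the 16 quantization levels of weights and activations.
    // In practice these come from the clustering step; here they are placeholders.
    std::array<float, 16> weight_centroids{};
    std::array<float, 16> act_centroids{};

    // Single 16 x 16 lookup table holding all pre-computed products.
    std::array<std::array<float, 16>, 16> product_lut{};

    void build_lut() {
        for (int w = 0; w < 16; ++w)
            for (int a = 0; a < 16; ++a)
                product_lut[w][a] = weight_centroids[w] * act_centroids[a];
    }

    // Dot product over 4-bit codes: every multiplication becomes a table lookup.
    float lut_dot(const std::vector<uint8_t>& w_codes,
                  const std::vector<uint8_t>& a_codes) {
        float acc = 0.0f;
        for (std::size_t i = 0; i < w_codes.size(); ++i)
            acc += product_lut[w_codes[i]][a_codes[i]];
        return acc;
    }

Because every layer shares the same 256-entry table after normalization, the table can stay in on-chip memory, which is what makes the hardware mapping compact.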
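Likewise, a minimal sketch of the normalize-then-cluster quantization step is given below, assuming per-layer standardization to zero mean and unit variance followed by a plain 1-D k-means-style clustering into 16 levels; the function names, the centroid initialization, and the iteration count are illustrative assumptions, not the thesis code.

    // Sketch of per-layer normalization followed by iterative clustering quantization.
    #include <algorithm>
    #include <cmath>
    #include <cstdint>
    #include <vector>

    // Standardize one layer's weights to zero mean and unit variance so that all
    // layers share a similar Gaussian profile and can share one LUT.
    void normalize_layer(std::vector<float>& w) {
        double mean = 0.0, var = 0.0;
        for (float x : w) mean += x;
        mean /= w.size();
        for (float x : w) var += (x - mean) * (x - mean);
        double stdev = std::sqrt(var / w.size()) + 1e-12;
        for (float& x : w) x = static_cast<float>((x - mean) / stdev);
    }

    // Iterative 1-D clustering of the normalized weights into 16 levels.
    // Returns the centroids; codes[i] is the 4-bit index assigned to w[i].
    std::vector<float> cluster_quantize(const std::vector<float>& w,
                                        std::vector<uint8_t>& codes,
                                        int levels = 16, int iters = 20) {
        // Initialize centroids uniformly over the observed range.
        float lo = w[0], hi = w[0];
        for (float x : w) { lo = std::min(lo, x); hi = std::max(hi, x); }
        std::vector<float> c(levels);
        for (int k = 0; k < levels; ++k)
            c[k] = lo + (hi - lo) * (k + 0.5f) / levels;

        codes.assign(w.size(), 0);
        for (int it = 0; it < iters; ++it) {
            // Assignment step: nearest centroid.
            for (std::size_t i = 0; i < w.size(); ++i) {
                int best = 0;
                for (int k = 1; k < levels; ++k)
                    if (std::fabs(w[i] - c[k]) < std::fabs(w[i] - c[best])) best = k;
                codes[i] = static_cast<uint8_t>(best);
            }
            // Update step: each centroid becomes the mean of its assigned weights.
            std::vector<double> sum(levels, 0.0);
            std::vector<int> cnt(levels, 0);
            for (std::size_t i = 0; i < w.size(); ++i) { sum[codes[i]] += w[i]; ++cnt[codes[i]]; }
            for (int k = 0; k < levels; ++k)
                if (cnt[k] > 0) c[k] = static_cast<float>(sum[k] / cnt[k]);
        }
        return c;
    }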
Keywords/Search Tags: Convolutional Neural Network, Network Quantization, FPGA, Accelerator, Power-Efficient Design