
Design And Implementation Of Convolutional Neural Network Accelerator Based On Affine Quantization

Posted on: 2020-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: C L Zeng
Full Text: PDF
GTID: 2518306518963679
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
In recent years, convolutional neural networks (CNNs) have been widely used in speech recognition, object detection, and image segmentation. With the rapid development of CNN algorithms, the compute-intensive and memory-intensive nature of large-scale CNNs poses many challenges for their application. At present, CNNs are mainly deployed on cloud servers; terminal data must be transmitted to the server for processing, which causes high power consumption and high latency. To solve these problems, FPGA-based CNN accelerators have gradually become a research hotspot. However, FPGA platforms are limited by on-chip resources and off-chip memory bandwidth, so under limited resources it is of great significance to compress CNN models in order to implement high-performance CNN accelerators.

First, this work analyzes the theory of affine quantization and applies it to the CNN inference process. Based on an analysis of the sources of precision loss introduced by affine quantization, different methods for obtaining the quantization parameters are proposed. Quantized CNN inference is implemented in TensorFlow, and the influence of different quantization parameters and quantization precisions on top-1 accuracy is analyzed. To improve the accuracy of the quantized CNN, a mixed-precision quantization scheme is proposed. The experimental results show that, with appropriate quantization parameters, the activations and weights can be quantized to 8 bits with less than 1% loss in accuracy.

Second, a high-performance quantized-CNN accelerator based on a Zynq-7000 series FPGA is implemented in this paper. According to the characteristics of CNNs and the embedded FPGA platform, a hardware-software co-design architecture is proposed. Under the constraint of limited on-chip resources, both DSPs and LUTs are used to implement multipliers; the parallelism and performance of the accelerator are analyzed, and an appropriate degree of parallelism is selected. For the 1×1 convolution operation, a multiplexed-parallelism design is proposed. To improve DSP utilization, an optimization strategy is proposed that implements two 8-bit multipliers with a single DSP. To improve the efficiency of data blocking, a two-dimensional DMA blocking strategy is proposed.

The experimental results show that the average performance of the CNN accelerator proposed in this paper reaches 416.3 GOPS, several times higher than previous designs based on the same FPGA platform. The performance of the design is 3.75 times higher than that of the CPU, and the energy efficiency is 1.42 times higher than that of the GPU.
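The two techniques at the heart of this abstract, affine quantization and sharing one wide DSP multiplier between two 8-bit products, can be sketched as follows. This is a minimal software emulation for unsigned operands; the function names, the range-clipping policy, and the 16-bit packing gap are illustrative assumptions, not details of the thesis implementation (the signed case on real DSP48 hardware needs additional correction terms).

```python
import numpy as np

# --- 1. Affine quantization: r ≈ scale * (q - zero_point) -------------------

def quantize(r, num_bits=8):
    """Map a float array onto unsigned num_bits integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    rmin, rmax = min(r.min(), 0.0), max(r.max(), 0.0)   # range must cover 0
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))        # real 0 maps exactly
    q = np.clip(np.round(r / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

# --- 2. Two 8-bit multiplications on one wide multiplier --------------------

def packed_multiply(a, b, c):
    """Compute a*c and b*c with a single wide multiplication.

    a, b, c are unsigned 8-bit values. a and b are packed into one operand
    with a 16-bit gap, so the partial products a*c*2**16 and b*c never
    overlap (b*c < 2**16 always holds for 8-bit inputs).
    """
    product = ((a << 16) | b) * c    # one multiplication: a*c*2**16 + b*c
    return product >> 16, product & 0xFFFF

x = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
q, s, z = quantize(x)
x_hat = dequantize(q, s, z)          # per-element error is at most scale/2

ac, bc = packed_multiply(200, 37, 251)   # ac == 200*251, bc == 37*251
```

Mapping real zero exactly onto an integer `zero_point` is what makes affine (asymmetric) quantization attractive for CNNs: zero padding and ReLU outputs incur no quantization error. The 16-bit gap in `packed_multiply` mirrors how two 8-bit weights can share one DSP multiplication, roughly doubling effective DSP throughput.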
Keywords/Search Tags: Convolutional neural network, Accelerator, Affine quantization, FPGA