Deep neural networks have made remarkable achievements in various domains. To deploy a model with a huge number of parameters and computations on edge computing platforms such as FPGAs, the model needs to be compressed, and quantization is one way to do so. An FPGA is a software-defined hardware chip that can be reprogrammed many times, which facilitates updating the deep neural network; it also offers high performance, low power consumption, and parallelism. Motivated by the above, this thesis studies quantization algorithms for deep neural networks and their FPGA implementation, and designs and implements an FPGA-based image classification system. The main work of this thesis is as follows:

(1) To reduce the hardware complexity of multiplication on FPGAs, this thesis proposes the Sum of Uniform and Power-of-Two (SUPT) quantization algorithm, together with a deployable quantization scheme based on it. On the FPGA platform, SUPT quantization reduces the LUT resources occupied by multiplication, so that multiplication can be implemented efficiently with LUTs. Experimental results show that SUPT quantization performs consistently across multiple models and is more general than uniform or power-of-two quantization alone.

(2) To make full use of the LUT and DSP resources of an FPGA, this thesis first designs a LUT-based multiplier and a DSP-based multiplier-adder. It then designs the convolution, max-pooling, and fully connected modules, combining optimization techniques such as adder trees, ping-pong buffering, and pipelining. Finally, it assembles these modules according to the network structures of ResNet18 and MobileNetV2, and designs and implements two FPGA-based accelerators. The ResNet18 accelerator has an on-chip power consumption of 6.551 W, a throughput of 112.67 GOPS, a latency of about 33.36 ms, and an accuracy of 70.64%; the MobileNetV2 accelerator has an on-chip power consumption of 5.42 W, a throughput of 108.08 GOPS, a latency of about 5.91 ms, and an accuracy of 67.16%.

(3) To put the accelerators to practical use, this thesis designs and implements an FPGA-based image classification system following software engineering practice. The system supports classification of multiple images and displays the classification results and the accelerator latency to users. It also supports switching between the deep neural network accelerators, giving users a choice of models.
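The abstract does not give the details of SUPT. A minimal sketch of one plausible reading, in which each weight is the sum of a value on a uniform grid and a residual rounded to the nearest signed power of two, is shown below; the parameters (`scale`, `bits`, `min_exp`) are illustrative assumptions, not the thesis's exact scheme:

```python
import numpy as np

def supt_quantize(w, scale=0.25, bits=3, min_exp=-6):
    """Quantize w as u + p: u is a uniform term on a grid of step `scale`,
    p is the residual rounded to the nearest signed power of two (or zero).
    Note: a plausible reading of SUPT, not necessarily the thesis's scheme."""
    # Uniform term: round to the signed grid and clip to the bit budget.
    u = np.clip(np.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale
    r = w - u
    sign = np.sign(r)
    mag = np.abs(r)
    # Power-of-two term: round log2 of the residual magnitude,
    # clip the exponent to [min_exp, 0]; tiny residuals map to zero.
    exp = np.clip(np.round(np.log2(np.where(mag > 0, mag, 2.0 ** min_exp))), min_exp, 0)
    p = np.where(mag >= 0.75 * 2.0 ** min_exp, sign * 2.0 ** exp, 0.0)
    return u + p

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, 1000)
u_only = np.clip(np.round(w / 0.25), -4, 3) * 0.25
mse_uniform = np.mean((w - u_only) ** 2)
mse_supt = np.mean((w - supt_quantize(w)) ** 2)
```

The appeal on hardware is that multiplying by such a weight splits into a small uniform multiply plus a bit shift, which maps cheaply onto LUTs; the power-of-two term can only shrink the residual, so the reconstruction error is no worse than uniform quantization alone.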
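As a quick consistency check (not taken from the thesis), throughput times latency should recover the per-image operation count; the reported figures line up with commonly cited workloads of roughly 3.6 GOP for ResNet18 and 0.6 GOP for MobileNetV2 at 224x224 input, counting a multiply-accumulate as two operations:

```python
# Reported throughput (GOPS) x reported latency (s) gives the operations
# executed per image; compare with well-known per-image workloads.
resnet18_gop = 112.67 * 33.36e-3     # ~3.76 GOP per image
mobilenetv2_gop = 108.08 * 5.91e-3   # ~0.64 GOP per image
print(round(resnet18_gop, 2), round(mobilenetv2_gop, 2))  # prints: 3.76 0.64
```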