As an important foundational technology for the practical deployment of Neural Networks (NNs), NN compression mainly includes quantization, distillation, and pruning. Among them, quantization is an effective way to compress the model size and to reduce system power consumption and latency. This thesis focuses on NN quantization algorithms and uses a Field Programmable Gate Array (FPGA) platform for algorithm deployment and system performance verification. First, the theoretical basis of NN quantization, including the common notation and concepts used in this field, is introduced. On this basis, and considering the deployment scenario of edge-side FPGAs, the thesis discusses and addresses the problems of imbalanced data distribution in Post-Training Quantization (PTQ) and software-hardware alignment in Quantization-Aware Training (QAT). The contributions of the thesis are as follows:

(1) For the PTQ scenario, two algorithms, cross-layer weight equalization and intra-layer activation-weight equalization, are proposed to address the imbalanced distributions of weights and activations, respectively.

(2) For the QAT scenario, an algorithm that re-estimates the statistics of Batch Normalization layers is proposed to improve model accuracy. In addition, by combining an integer-only QAT algorithm with an overflow-aware QAT algorithm, the thesis simulates inter-layer requantization and accumulator overflow during training, minimizing the mismatch between software training and hardware deployment.

(3) The thesis further proposes an NN quantization and implementation platform that realizes the complete conversion from a trained NN in a deep learning framework to an FPGA deployment, providing a tool to verify the effectiveness of the algorithms.

For the single-object detection task on DaJiang Innovations' drones, the thesis deploys the above algorithms on an Ultra96-V2 board in both the PTQ and QAT scenarios and evaluates the effectiveness of the proposed methods. Experimental results demonstrate that the proposed algorithms significantly reduce the model size and improve the Average Intersection Over Union (AIOU) in practical deployment. Under the chosen quantization configuration, the model parameter size is reduced substantially. In the PTQ scenario, equalizing the data ranges raises the AIOU from 69.29% to 71.56%. In the QAT scenario, software-hardware alignment raises the AIOU from 72.35% to 73.54%, almost without any loss compared to the original floating-point model.
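To make the cross-layer weight equalization idea in (1) concrete, the following is a minimal NumPy sketch, not the thesis implementation; all names and shapes are illustrative assumptions. It rescales the output channels of one layer and the matching input channels of the next so that their per-channel weight ranges are balanced, which leaves the composed function unchanged when a ReLU (which is positively homogeneous) sits between the layers.

```python
import numpy as np

def cross_layer_equalize(w1, b1, w2, eps=1e-12):
    """Equalize per-channel weight ranges of two consecutive layers.

    Illustrative sketch: w1 has shape (out_ch, in_ch), b1 (out_ch,),
    w2 (next_out_ch, out_ch). Scaling channel i of layer 1 by 1/s[i]
    and column i of layer 2 by s[i] leaves the ReLU network function
    unchanged while balancing both weight ranges.
    """
    r1 = np.abs(w1).max(axis=1)            # per-output-channel range of layer 1
    r2 = np.abs(w2).max(axis=0)            # per-input-channel range of layer 2
    s = np.sqrt(r1 / np.maximum(r2, eps))  # makes both ranges equal sqrt(r1 * r2)
    s = np.where(s > eps, s, 1.0)          # leave near-dead channels untouched
    w1_eq = w1 / s[:, None]
    b1_eq = b1 / s
    w2_eq = w2 * s[None, :]
    return w1_eq, b1_eq, w2_eq
```

After such a rescaling, a per-tensor quantizer sees similar magnitudes across the channels of both layers, which is exactly the imbalance the PTQ equalization algorithms target.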
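The software-hardware alignment in (2) depends on reproducing hardware effects, such as fixed-width accumulator overflow, inside the training loop. As a hedged illustration of that effect only (assuming a two's-complement accumulator; the function and its parameters are hypothetical, not the thesis code), the sketch below wraps partial sums into a b-bit range so that training observes the same overflow behavior as the deployed hardware.

```python
import numpy as np

def overflow_aware_matmul(x_q, w_q, acc_bits=16):
    """Integer matmul whose running sums wrap like a fixed-width
    two's-complement hardware accumulator.

    x_q: (batch, in_dim) integer array; w_q: (in_dim, out_dim) integer array.
    """
    lo = -(1 << (acc_bits - 1))
    span = 1 << acc_bits
    acc = np.zeros((x_q.shape[0], w_q.shape[1]), dtype=np.int64)
    for k in range(x_q.shape[1]):              # accumulate one product term at a time
        acc += np.outer(x_q[:, k], w_q[k, :]).astype(np.int64)
        acc = (acc - lo) % span + lo           # wrap into [-2^(b-1), 2^(b-1))
    return acc
```

Using a wrapped matmul of this kind in the QAT forward pass lets the training loss penalize weight values whose accumulations overflow; inter-layer requantization can be simulated in the same spirit with an integer multiply followed by a right shift.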