Research And Application Of Neural Network Quantization Aware Training Methods

Posted on:2024-09-16

Degree:Master

Type:Thesis

Country:China

Candidate:Y Dan

Full Text:PDF

GTID:2568307079472274

Subject:Electronic information

Abstract/Summary:

Convolutional neural networks have achieved major breakthroughs in artificial intelligence, but their high computing cost and memory usage hinder wide deployment on edge devices. A model compression scheme is therefore needed to train compact yet effective convolutional neural network models. Model quantization can significantly reduce the storage footprint of a model and offers low memory consumption, low energy consumption, and efficient inference. Quantization aware training is a training method that reduces the quantization error of a model. Existing quantization methods have several shortcomings: the gradients used in backward propagation do not match the forward computation, quantization parameters must be tuned manually, and object detection models are only partially quantized. To address these problems, the main contributions of this thesis are as follows:

(1) The STE (Straight Through Estimator) can effectively propagate gradients during training, but its coarse gradient mapping can cause gradient mismatch. This thesis therefore improves the STE method: it proposes a quantization algorithm based on a relaxation mapping function and introduces trainable quantization parameters, addressing both the severe accuracy loss and the gradient mismatch that occur during training. With a ResNet34 network on the public datasets MNIST and CIFAR10, the proposed algorithm improves accuracy by 0.04 and 1.02, respectively, which demonstrates its effectiveness. Compared with the original STE, model training converges faster. When weights and activations are quantized to 2 bits/2 bits, accuracy increases by 2.7 percentage points and the average inference time per image is reduced by 1.46 ms.

(2) Object detection models contain more complex operators, which are not considered in image classification tasks. To address the incomplete quantization, severe accuracy loss, and slow inference of object detection models, this thesis quantizes the special operators in the object detection model, balances the weight ranges of adjacent convolutional layers, fuses the convolution/batch-normalization and convolution/ReLU layers in the network, and proposes a complete set of quantization algorithms for object detection models. With the YOLO model on the COCO dataset, the average inference time per image is reduced by 8.7 ms, which verifies the effectiveness of the proposed algorithm. After the model is quantized on the VOC dataset, model inference is accelerated by a factor of 4.

(3) An intelligent UAV perception system is designed and implemented. The quantization algorithm proposed in this thesis is integrated with practical UAV tasks, and a UAV visualization platform and a model quantization platform are developed. Testing and analyzing model inference performance before and after quantization on an onboard computer shows that the quantized model improves, to varying degrees, in memory usage, power consumption, and inference speed, which further verifies the feasibility and effectiveness of the proposed algorithm.
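The gradient mismatch mentioned in contribution (1) can be illustrated with a minimal sketch of the baseline STE only. It does not reproduce the relaxation mapping function proposed in the thesis, whose form is not given in this abstract. The function names `fake_quantize` and `ste_backward`, the symmetric signed integer grid, and the fixed `scale` are assumptions made for illustration.

```python
def fake_quantize(x, scale, bits=2):
    """Uniform symmetric fake quantization of a list of floats (illustrative).

    Returns (quantized values, STE mask). The mask is the gradient the
    straight-through estimator uses for d(out)/d(x): 1 inside the clipping
    range, 0 outside it.
    """
    qmin = -(2 ** (bits - 1))      # -2 for 2 bits
    qmax = 2 ** (bits - 1) - 1     # +1 for 2 bits
    out, mask = [], []
    for v in x:
        level = v / scale                       # position on the integer grid
        inside = qmin <= level <= qmax
        clipped = min(max(level, qmin), qmax)   # saturate to the grid range
        # Python's round() rounds exact halves to the nearest even integer.
        out.append(round(clipped) * scale)      # snap to a level, then rescale
        mask.append(1.0 if inside else 0.0)
    return out, mask


def ste_backward(grad_out, mask):
    """Backward pass under STE: treat rounding as the identity function."""
    return [g * m for g, m in zip(grad_out, mask)]


values, mask = fake_quantize([-1.3, -0.2, 0.3, 0.9], scale=0.5, bits=2)
print(values)                                  # [-1.0, 0.0, 0.5, 0.5]
print(ste_backward([1.0, 1.0, 1.0, 1.0], mask))  # [0.0, 1.0, 1.0, 0.0]
```

The forward pass is a staircase whose true derivative is zero almost everywhere, while the backward pass pretends the staircase is the identity inside the clipping range. That discrepancy is the mismatch the abstract refers to, and it grows as the bit width shrinks because the steps become coarser.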
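The convolution/batch-normalization fusion mentioned in contribution (2) is commonly done by folding the batch-norm statistics into the convolution weights and bias before quantization. The sketch below shows that standard folding, assuming inference-time (frozen) statistics. It is not taken from the thesis, and the helper names `fold_batchnorm`, `conv_channel`, and `batchnorm`, as well as the flattened-kernel representation, are hypothetical.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into conv weights and bias.

    weights: one flattened kernel (list of floats) per output channel.
    The remaining arguments hold one float per output channel.
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        k = g / math.sqrt(v + eps)            # per-channel rescaling factor
        folded_w.append([k * w for w in w_c])
        folded_b.append(k * (b_c - m) + bt)
    return folded_w, folded_b


def conv_channel(w_c, b_c, patch):
    """One output value of a convolution: dot product with a patch plus bias."""
    return sum(w * x for w, x in zip(w_c, patch)) + b_c


def batchnorm(y, g, bt, m, v, eps=1e-5):
    """Inference-time batch normalization of a single value."""
    return g * (y - m) / math.sqrt(v + eps) + bt


# Check on one input patch that conv followed by BN equals the folded conv.
weights = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.3]
gamma, beta = [1.5, 0.8], [0.2, -0.1]
mean, var = [0.4, -0.2], [0.9, 1.6]
patch = [1.0, 2.0]

fw, fb = fold_batchnorm(weights, bias, gamma, beta, mean, var)
for c in range(2):
    separate = batchnorm(conv_channel(weights[c], bias[c], patch),
                         gamma[c], beta[c], mean[c], var[c])
    fused = conv_channel(fw[c], fb[c], patch)
    assert abs(separate - fused) < 1e-9
```

Folding first means only one set of weights per layer has to be quantized, and the batch-norm arithmetic disappears from the inference graph.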

Keywords/Search Tags:

Convolutional Neural Networks, Model Compression, Quantization Aware Training, Object Detection

Related items

1	Study Of Mixed Precision Quantization Of Convolution Neural Network
2	Research On Technology Of Model Compression For Convolutional Neural Networks
3	Research Of Model Compression Method Based On Quantized Convolutional Neural Network
4	Researches On Inference Acceleration Of Convolutional Neural Networks For Object Detection
5	Research And Application Of Parallel Training And Model Compression For Deep Neural Networks
6	Researchs On Compression Algorithms Of Convolutional Neural Networks
7	Quantization Algorithm Based On Progressive Optimization And Distribution-aware Analysis
8	Research On Neural Network Quantization Algorithm And Its Implementation On FPGA
9	Model Compression And Hardware Acceleration Of Convolutional Neural Networks
10	Research On Deep Convolutional Neural Network Training And Inference Optimization For Edge Intelligence