
Research On Quantization Method Of Mixed Precision Super-resolution Model

Posted on: 2022-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Liu
Full Text: PDF
GTID: 2558307169480264
Subject: Engineering
Abstract/Summary:
The computing power and storage capacity of today's hardware already allow machine learning to perform well across many fields; computer vision models based on deep learning, for example, are used to great effect in industry, commerce, navigation, communications, and aerospace. However, as models keep growing they consume ever more memory, and training and inference times lengthen, reducing computational efficiency and increasing system energy consumption. This hinders the deployment of deep learning models on edge, mobile, and micro devices. Model compression techniques such as model quantization have emerged to address these problems. Among existing methods, quantization-aware training accounts for the accuracy loss caused by data mapping during training: weights and other data are clamped and approximated during quantization, and the resulting quantization errors are propagated through the loss function and updated in back-propagation, thereby optimizing the quantized result.

This paper takes the super-resolution model as its research object. Super-resolution is a typical application of deep learning; its goal is to enlarge a small image while keeping texture details sharp. At present, super-resolution models are usually quantized at a single, uniform precision. Although this greatly reduces model size and inference time, it also significantly degrades model accuracy, and different stages of a super-resolution model are affected differently by quantization: quantizing some stages hurts accuracy considerably, while quantizing others has little effect.

To obtain better quantization results for super-resolution models, this paper proposes a mixed-precision quantization framework for super-resolution models. The framework automatically applies mixed-precision quantization to existing super-resolution models; while greatly reducing model size and inference time, it produces results very close to those of the full-precision model. The main work and innovations of this paper are as follows:

1. The concept of "quantization sensitivity" is proposed. Quantization sensitivity refers to the phenomenon that, when a network model is quantized, the errors introduced by quantizing different stages of the network differ greatly. The cause is that the weight parameters of each stage carry different importance within the model as a whole; stages that are insensitive to quantization do not affect the other stages, and vice versa. On this basis, a multi-stage quantization scheme for super-resolution models is proposed: during training, each stage of the model is quantized, and individual stages or combinations of stages are quantized together. This avoids quantizing the entire model at a single precision and also reveals which stages of the model are more "sensitive" to quantization.

2. A mixed-precision quantization method for super-resolution models is designed and implemented. To jointly optimize quantization accuracy, inference time, and model size, and based on the earlier multi-stage quantization results, a mixed-precision scheme is used that quantizes sensitive stages at a higher bit-width and insensitive stages at the original quantization bit-width. This method reduces model size and shortens inference time, while the model's output remains close to that of the full-precision model.

3. A mechanism for automatically selecting the quantization precision is proposed, and the mixed-precision quantization framework for super-resolution models is designed around it. During multi-precision quantization, we find that in sensitive stages the ratio of input to output channels in the convolution kernels differs greatly, i.e., the dimensionality of the data transformation changes sharply. The framework therefore implements a mechanism for automatically identifying sensitive stages: as the model is traversed, each stage's quantization precision is selected according to the ratio of input to output channels in that stage's convolution kernels, improving the degree of automation of the whole framework.

This paper selects two typical super-resolution models, SRGAN and ESRGAN, as test objects; both are deep learning models based on generative adversarial networks (GANs). The quantized results of the two models are analyzed and compared in three respects: model size, inference time, and reconstruction quality. To demonstrate the generalization ability of the proposed mixed-precision quantization method, eight image sets typical of the super-resolution field are used as test sets. The average error after quantization increases by only 5.07% and 0.18% respectively, and on some data sets the mixed-precision results even surpass those of the original model.
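The automatic precision-selection rule described above can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual implementation: the function names, the ratio threshold of 2.0, and the 16-bit/8-bit choices are all assumptions made for the example; the abstract only states that precision is chosen per stage from the input/output channel ratio of its convolution kernels.

```python
# Hypothetical sketch: stages whose conv kernels change dimensionality
# sharply (large in/out channel ratio) are treated as quantization-
# sensitive and assigned a higher bit-width. Threshold and bit-widths
# below are illustrative assumptions, not values from the thesis.

def channel_ratio(in_ch, out_ch):
    """Ratio of input to output channels, taken >= 1 for symmetry."""
    return max(in_ch, out_ch) / min(in_ch, out_ch)

def select_bitwidth(stages, ratio_threshold=2.0, high_bits=16, low_bits=8):
    """Assign a bit-width to each (name, in_ch, out_ch) stage."""
    plan = {}
    for name, in_ch, out_ch in stages:
        sensitive = channel_ratio(in_ch, out_ch) >= ratio_threshold
        plan[name] = high_bits if sensitive else low_bits
    return plan

# Example: an upsampling stage expanding 64 -> 256 channels is flagged
# as sensitive, while a 64 -> 64 residual block is quantized at low bits.
stages = [("residual_block", 64, 64), ("upsample", 64, 256)]
print(select_bitwidth(stages))  # {'residual_block': 8, 'upsample': 16}
```

In a real framework this plan would then drive per-stage quantization-aware training, with the high-bit stages preserving the detail reconstruction that uniform low-bit quantization degrades.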
Keywords/Search Tags:Model quantization, Quantization sensitivity, Mixed-precision quantization framework, Super-resolution, Quantization aware training