
Research On Deep Learning Model Quantization And Related Compression Technologies

Posted on: 2021-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zhou
Full Text: PDF
GTID: 2518306503972639
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of deep convolutional neural networks (CNNs), image classification, object detection, and semantic segmentation have achieved great breakthroughs in recent years. At the same time, the number of parameters and the amount of computation required by CNNs keep increasing, making it challenging to deploy these networks on resource-constrained hardware. It is therefore necessary to study model compression algorithms that compress existing CNN models and minimize their memory usage and computational cost. We study CNN model compression and acceleration algorithms, analyze existing compression methods, and complete two pieces of work on model quantization.

First, we quantize CNN parameters to fixed-point numbers to compress the model size and improve efficiency. Based on the parameter distributions of CNN models, we propose a quantization method built on scale-factor estimation. We then use this method to quantize ICNet, a real-time semantic segmentation network, applying channel-wise weight quantization and a progressive quantization strategy to reduce the accuracy loss. Our 4-bit quantized ICNet loses 4% accuracy, while the model size is reduced by a factor of 8.

Second, we study the deployment and implementation of 8-bit networks. For pavement detection tasks, we use the TensorRT inference accelerator to optimize the network and deploy it on GPU devices. Because TensorRT cannot realize our quantization method, we implement INT8 forward inference ourselves based on CUDA programming and the cuDNN neural network acceleration library. Experimental results show that, compared with the full-precision network, our framework doubles the inference speed while the accuracy of the quantized network is only about 0.3% lower than that of the original network.
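The abstract does not detail how the scale factors are estimated. As an illustration only, the following is a minimal NumPy sketch of channel-wise fixed-point weight quantization using a simple max-abs estimator per output channel; the function names and the estimator itself are assumptions, not the thesis's actual method.

```python
import numpy as np

def quantize_weights(w: np.ndarray, num_bits: int = 4):
    """Quantize a conv weight tensor (out_ch, in_ch, kh, kw) channel-wise.

    Returns int8 codes and per-channel scale factors such that
    w is approximately codes * scales.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for signed 4-bit
    # One scale per output channel, estimated from that channel's range
    # (hypothetical max-abs estimator; the thesis's estimator is not given).
    max_abs = np.abs(w).reshape(w.shape[0], -1).max(axis=1)
    scales = np.maximum(max_abs / qmax, 1e-8)   # guard all-zero channels
    codes = np.round(w / scales[:, None, None, None])
    codes = np.clip(codes, -qmax - 1, qmax).astype(np.int8)
    return codes, scales

def dequantize(codes: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Map integer codes back to float, e.g. for simulated quantization."""
    return codes.astype(np.float32) * scales[:, None, None, None]
```

A 4-bit setting is consistent with the 8x size reduction reported above (32-bit floats compressed to 4-bit codes).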
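Likewise, the custom INT8 forward path is described only at a high level. The sketch below shows one common way such a path can be structured: lowering a convolution (stride 1, no padding) to im2col plus an integer GEMM with int32 accumulation, followed by per-channel requantization. All names here are hypothetical; the thesis's CUDA/cuDNN implementation is not reproduced.

```python
import numpy as np

def conv2d_int8(x_q, w_q, x_scale, w_scales):
    """x_q: int8 input (c, h, w); w_q: int8 weights (oc, c, kh, kw).

    Accumulates in int32, then rescales to float per output channel,
    mirroring how an INT8 GPU path keeps the expensive math in integers.
    """
    oc, c, kh, kw = w_q.shape
    _, h, w = x_q.shape
    oh, ow = h - kh + 1, w - kw + 1
    # im2col: gather the kh*kw*c patch under each output position.
    cols = np.zeros((c * kh * kw, oh * ow), dtype=np.int32)
    idx = 0
    for i in range(oh):
        for j in range(ow):
            cols[:, idx] = x_q[:, i:i + kh, j:j + kw].ravel()
            idx += 1
    # Integer GEMM with int32 accumulation (the cheap part on INT8 hardware).
    acc = w_q.reshape(oc, -1).astype(np.int32) @ cols
    # Requantize: one float multiply per output channel.
    out = acc.astype(np.float32) * (x_scale * w_scales)[:, None]
    return out.reshape(oc, oh, ow)
```

On a GPU, the GEMM would be dispatched to an INT8 kernel (e.g. via cuDNN), which is where the roughly 2x speedup reported above would come from; the NumPy version only illustrates the data flow.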
Keywords/Search Tags: Deep Learning, Convolutional Neural Networks, Model Quantization, Semantic Segmentation