
Design And Implementation Of Hybrid Model Compression For Image Classification

Posted on: 2022-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: K Zheng
GTID: 2518306512996079
Subject: Electronic information technology and instrumentation

Abstract/Summary:
In recent years, deep learning has been widely used in computer vision, surpassing traditional vision algorithms and even human performance in tasks such as image classification, object detection, semantic segmentation, super-resolution, and face recognition. However, deep learning models rely on deep network structures, huge numbers of parameters and computations, and large amounts of training data. While delivering high accuracy, they also place heavy demands on storage and computing hardware. High memory footprint and slow inference speed have become major obstacles to deploying deep learning in practical applications. Therefore, compressing deep learning models and accelerating their inference has great academic value and engineering significance.

This thesis studies image classification with convolutional neural networks. It analyzes, compares, and improves existing model compression methods, proposes a hybrid model compression algorithm, and provides a generalized implementation of that algorithm, so that it can be applied quickly and easily in practical scenarios, narrowing the gap between model training and model deployment and accelerating the adoption of deep learning.

To fully compress and accelerate convolutional neural networks for image classification tasks, this thesis proposes a hybrid model compression algorithm. First, a self-distillation algorithm improves the training accuracy of the original model and, at a small additional training cost, produces a series of auxiliary branch structures. The model is then pruned at the filter level and fixed-point quantized in turn, and the pruned and quantized model is retrained with the help of knowledge distillation. Next, all batch normalization (BN) layers in the model are merged into the preceding convolutional layers to further speed up inference. Finally, with the help of the auxiliary branch structures generated by
the self-distillation algorithm, an early-stop mechanism is implemented: when an auxiliary branch reaches a sufficiently high prediction confidence, the inference process is terminated early, achieving dynamic model compression and inference acceleration. This thesis discusses the effectiveness of each individual compression technique and the compatibility among the techniques, and finally combines them to achieve the maximum compression effect.

At present, most open-source model compression projects adopt a hard-coded programming style that requires extensive modification of existing model definitions, making it difficult to extend the techniques to other models. General software frameworks for deep learning model compression are rare, and such a framework for PyTorch, which is based on dynamic graphs, is especially difficult to implement. Building on the proposed hybrid model compression algorithm, this thesis uses PyTorch's software features to capture and analyze the computation graph of the original model. Then, without modifying the original model definition code, a series of auxiliary modules provides a configurable model compression interface, yielding a general implementation of the model compression algorithm.

The main innovations of this thesis are as follows:
1) Existing model compression algorithms are analyzed and compared and, with the convenience and compatibility of practical applications in mind, several of them are improved and organically combined into a hybrid model compression scheme for image classification tasks.
2) Based on the design of the hybrid model compression algorithm, combined with the software characteristics of PyTorch, a general implementation of the model compression algorithm is obtained.
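Filter-level pruning of the kind described above is commonly done by ranking each convolutional filter by the L1 norm of its weights and discarding the smallest ones. The following is a minimal NumPy sketch of that idea, not the thesis's actual implementation; the function name `prune_filters_l1` and the `keep_ratio` parameter are illustrative assumptions.

```python
import numpy as np

def prune_filters_l1(weight, keep_ratio=0.5):
    """Rank conv filters by L1 norm and keep only the largest ones.

    weight: array of shape (out_channels, in_channels, kH, kW).
    Returns the pruned weight tensor and the indices of kept filters.
    """
    # L1 norm of each output filter
    norms = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(weight.shape[0] * keep_ratio)))
    # take the n_keep largest-norm filters, preserving original order
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])
    return weight[keep], keep

# Example: 4 filters, keep the 2 with the largest L1 norm
w = np.zeros((4, 3, 3, 3))
w[0] += 0.01   # tiny norm  -> pruned
w[1] += 1.0    # large norm -> kept
w[2] += 0.02   # tiny norm  -> pruned
w[3] += 0.5    # large norm -> kept
pruned, kept = prune_filters_l1(w, keep_ratio=0.5)
```

In a real network, pruning output filters of one layer also requires removing the corresponding input channels of the next layer, which is why the thesis retrains the pruned model afterwards.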
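The fixed-point quantization step can be illustrated with a simple symmetric per-tensor scheme: floats are mapped to signed integers with a single scale factor. This is a generic sketch under that assumption, not the specific quantizer used in the thesis.

```python
import numpy as np

def quantize_symmetric(x, n_bits=8):
    """Symmetric per-tensor fixed-point quantization.

    Maps floats to n_bits signed integers; returns the integer
    tensor plus the scale needed to dequantize it."""
    qmax = 2 ** (n_bits - 1) - 1                 # e.g. 127 for 8 bits
    max_abs = np.abs(x).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, s = quantize_symmetric(x)
x_hat = dequantize(q, s)   # reconstruction error is bounded by the scale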
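The knowledge-distillation retraining mentioned above typically minimizes the KL divergence between temperature-softened teacher and student distributions (in the style of Hinton et al.). A minimal NumPy sketch of that soft-target loss follows; the exact loss weighting used in the thesis may differ.

```python
import numpy as np

def softmax(z, t=1.0):
    z = np.asarray(z, dtype=np.float64) / t
    z = z - z.max()                    # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, t=4.0):
    """KL divergence between the teacher's and the student's
    temperature-softened output distributions, scaled by t^2 so the
    gradient magnitude stays comparable across temperatures."""
    p_t = softmax(teacher_logits, t)
    p_s = softmax(student_logits, t)
    return float(t * t * np.sum(p_t * (np.log(p_t) - np.log(p_s))))

teacher = np.array([2.0, 1.0, 0.1])
matched = distillation_loss(teacher, teacher)          # ~0: distributions agree
mismatched = distillation_loss(np.array([0.1, 1.0, 2.0]), teacher)  # > 0
```

Because KL divergence is zero only when the two distributions match, minimizing this loss pulls the pruned/quantized student's outputs back toward the uncompressed teacher's behavior.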
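Merging BN layers into the preceding convolution is a lossless algebraic rewrite: since BN applies a per-channel affine transform, it can be absorbed into the convolution's weights and bias. A minimal sketch, verified on a 1x1 convolution (which reduces to a matrix multiply per pixel); the helper name `fold_bn_into_conv` is illustrative.

```python
import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a BatchNorm layer that follows a convolution into it.

    BN(conv(x)) = gamma * (conv(x) - mean) / sqrt(var + eps) + beta
    equals a single convolution with
        w' = w * s   and   b' = (b - mean) * s + beta,
    where s = gamma / sqrt(var + eps), applied per output channel.
    """
    s = gamma / np.sqrt(var + eps)
    w_folded = w * s[:, None, None, None]   # scale each output filter
    b_folded = (b - mean) * s + beta
    return w_folded, b_folded

# Verify equivalence on a 1x1 convolution over one pixel (3 channels in, 4 out)
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 1, 1)); b = rng.normal(size=4)
gamma = rng.normal(size=4); beta = rng.normal(size=4)
mean = rng.normal(size=4); var = rng.uniform(0.5, 2.0, size=4)
x = rng.normal(size=3)
conv_out = w.reshape(4, 3) @ x + b
bn_out = gamma * (conv_out - mean) / np.sqrt(var + 1e-5) + beta
w_f, b_f = fold_bn_into_conv(w, b, gamma, beta, mean, var)
folded_out = w_f.reshape(4, 3) @ x + b_f
```

After folding, the BN layer is removed entirely at inference time, saving one memory pass per channel without changing the network's outputs.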
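The early-stop mechanism built on the auxiliary branches can be sketched as a confidence-gated cascade: branches are evaluated from shallow to deep, and inference halts as soon as one branch's softmax confidence clears a threshold. This is a schematic NumPy version, not the thesis's implementation; the threshold value is an assumption.

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=np.float64)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def early_exit_predict(branch_logits, threshold=0.9):
    """Evaluate auxiliary branches shallowest-first; stop as soon as
    one branch's top-class probability exceeds the threshold.

    branch_logits: list of logit vectors, one per branch (deepest last).
    Returns (predicted_class, index_of_exit_branch)."""
    for i, logits in enumerate(branch_logits):
        p = softmax(logits)
        if p.max() >= threshold:
            return int(np.argmax(p)), i
    # no branch was confident enough: fall through to the final classifier
    return int(np.argmax(branch_logits[-1])), len(branch_logits) - 1

# Branch 0 is uncertain; branch 1 is confident, so inference exits there
cls, exit_idx = early_exit_predict([
    np.array([1.0, 1.1, 0.9]),   # near-uniform logits -> low confidence
    np.array([5.0, 0.1, 0.0]),   # peaked logits       -> high confidence
])
```

Because easy inputs exit at shallow branches, average inference cost drops while hard inputs still receive the full network's prediction, which is what makes the compression "dynamic".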
Keywords/Search Tags:Image Classification, Convolutional Neural Network, Model Compression, Knowledge Distillation, Pruning, Quantization, Early Stop