
Research On Model Compression Algorithm Based On Knowledge Distillation And Reinforcement Learning

Posted on: 2024-08-06    Degree: Master    Type: Thesis
Country: China    Candidate: J Li    Full Text: PDF
GTID: 2568307094974379    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid improvement of hardware and the growing ease of data acquisition, deep convolutional neural networks have been widely applied in many fields. However, high-performing networks usually come with large numbers of parameters and heavy computation, and the computing power of common devices cannot meet the requirements of such models, which makes it difficult to popularize and deploy them. To address this problem, this thesis builds on existing network pruning methods and introduces the attention mechanism, knowledge distillation, and reinforcement learning to compress models to a greater degree while keeping their accuracy essentially unaffected. The main research work is as follows:

(1) Based on a detailed analysis of existing model compression methods, a fusion pruning algorithm based on the attention mechanism and knowledge distillation is proposed. This method combines three model compression techniques (network pruning, parameter quantization, and knowledge distillation) for the first time and improves the knowledge distillation component. Specifically, an attention mechanism is introduced into the teacher network to improve its performance and thereby the distillation effect, and an interpretable method for allocating weights during the knowledge distillation process is proposed, which guides network training effectively (a sketch of such a weighted loss follows this abstract). Experimental results show that the compressed model achieves more than 4% higher accuracy than the previous method while using fewer parameters.

(2) Traditional model compression requires domain experts to adopt different strategies for different situations, which makes the whole compression process time-consuming and the final result often unsatisfactory. On the basis of the original automatic model compression method, this thesis introduces a new state quantity i_t, representing the importance of the t-th network layer, into the state space of the reinforcement learning agent (see the state-vector sketch below). In this way, the reward can be shaped so that unimportant layers are pruned more aggressively, and the improved knowledge distillation method is integrated into the compression process. The model produced by the whole automatic compression pipeline is more accurate, with fewer parameters and less computation. Taking ResNet50 on the CIFAR100 dataset as an example, the compressed model obtained with this method has one fifth fewer parameters than that of the original automatic compression method, while its accuracy is nearly 3% higher.

(3) With the help of open-source frameworks such as ONNX and NCNN, the models before and after compression are deployed on a mobile device and their performance is compared. Experimental results show that the compressed model takes about half as long as the original model to process the same image.
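The abstract does not give the exact formula of the interpretable weight-allocation scheme in (1), so the following is only a minimal PyTorch-style sketch under assumed names: the per-sample weight alpha is derived here from the teacher's confidence, which is an assumption rather than the thesis's actual rule.

```python
import torch
import torch.nn.functional as F

def weighted_distillation_loss(student_logits, teacher_logits, labels, T=4.0):
    """Distillation loss with a per-sample weight (illustrative sketch only)."""
    # Softened teacher/student distributions at temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)

    # Per-sample KL divergence (distillation term), scaled by T^2 as usual.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="none").sum(dim=1) * (T * T)

    # Per-sample cross-entropy against the hard labels.
    ce = F.cross_entropy(student_logits, labels, reduction="none")

    # Assumed weight: trust the teacher more where it is confident and correct.
    conf, pred = F.softmax(teacher_logits, dim=1).max(dim=1)
    alpha = conf * (pred == labels).float()

    return (alpha * kd + (1.0 - alpha) * ce).mean()
```

A typical call would be `loss = weighted_distillation_loss(student(x), teacher(x).detach(), y)`, so that gradients flow only through the student.

Similarly, the augmented reinforcement-learning state in (2) can be pictured as the usual per-layer embedding extended with the importance quantity i_t. The importance proxy used below (mean absolute weight of the layer) is an assumption, since the thesis only states that i_t measures the importance of the t-th layer.

```python
import torch
import torch.nn as nn

def layer_state(t, layer: nn.Conv2d, layer_flops: float, total_flops: float):
    """State vector for layer t in the RL pruning loop (illustrative sketch)."""
    # Assumed importance proxy i_t: mean absolute weight of the layer.
    i_t = layer.weight.detach().abs().mean().item()
    return torch.tensor([
        float(t),                    # layer index
        float(layer.in_channels),    # input channels
        float(layer.out_channels),   # output channels
        layer_flops / total_flops,   # this layer's share of total FLOPs
        i_t,                         # the added layer-importance state i_t
    ])
```

For the deployment step in (3), a compressed PyTorch model would typically be exported with torch.onnx.export and then converted with NCNN's onnx2ncnn tool before being benchmarked on the mobile device; the thesis does not detail the exact conversion settings used.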
Keywords/Search Tags:Convolutional neural network, Model compression, Knowledge distillation, Attention mechanism, Network pruning, Reinforcement learning