
Research On Model Compression Algorithm Based On Knowledge Distillation And Reinforcement Learning

Posted on: 2024-08-06    Degree: Master    Type: Thesis
Country: China    Candidate: J Li    Full Text: PDF
GTID: 2568307094974379    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid improvement of hardware and the growing ease of data acquisition, deep convolutional neural networks have been widely applied in many fields. However, high-performing networks usually come with large numbers of parameters and heavy computation, and the computing power of common devices cannot meet the requirements of such models, which makes it difficult to popularize and deploy them. To address this problem, this thesis builds on existing network pruning methods and introduces the attention mechanism, knowledge distillation, and reinforcement learning to compress models to a greater degree while keeping their accuracy essentially unaffected. The main research work is as follows:

(1) Based on a detailed analysis of existing model compression methods, a fusion pruning algorithm based on the attention mechanism and knowledge distillation is proposed. This method combines three model compression techniques (network pruning, parameter quantization, and knowledge distillation) for the first time and improves the knowledge distillation component. Specifically, an attention mechanism is introduced into the teacher network to improve its performance and thereby the distillation effect, and an interpretable method for allocating weights during the knowledge distillation process is proposed, which guides network training effectively (a sketch of such a weighted loss follows this abstract). Experimental results show that the compressed model achieves more than 4% higher accuracy than the previous method while using fewer parameters.

(2) Traditional model compression requires domain experts to adopt different strategies for different situations, which makes the whole compression process time-consuming and the final result often unsatisfactory. On the basis of the original automatic model compression method, this thesis introduces a new state quantity i_t, representing the importance of the t-th network layer, into the state space of the reinforcement learning agent (see the state-vector sketch below). In this way, the reward can be shaped so that unimportant layers are pruned more aggressively, and the improved knowledge distillation method is integrated into the compression process. The model produced by the whole automatic compression pipeline is more accurate, with fewer parameters and less computation. Taking ResNet50 on the CIFAR100 dataset as an example, the compressed model obtained with this method has one fifth fewer parameters than that of the original automatic compression method, while its accuracy is nearly 3% higher.

(3) With the help of open-source frameworks such as ONNX and NCNN, the models before and after compression are deployed on a mobile device and their performance is compared. Experimental results show that the compressed model takes about half as long as the original model to process the same image.
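The abstract does not give the exact formula of the interpretable weight-allocation scheme in (1), so the following is only a minimal PyTorch-style sketch under assumed names: the per-sample weight alpha is derived here from the teacher's confidence, which is an assumption rather than the thesis's actual rule.

```python
import torch
import torch.nn.functional as F

def weighted_distillation_loss(student_logits, teacher_logits, labels, T=4.0):
    """Distillation loss with a per-sample weight (illustrative sketch only)."""
    # Softened teacher/student distributions at temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)

    # Per-sample KL divergence (distillation term), scaled by T^2 as usual.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="none").sum(dim=1) * (T * T)

    # Per-sample cross-entropy against the hard labels.
    ce = F.cross_entropy(student_logits, labels, reduction="none")

    # Assumed weight: trust the teacher more where it is confident and correct.
    conf, pred = F.softmax(teacher_logits, dim=1).max(dim=1)
    alpha = conf * (pred == labels).float()

    return (alpha * kd + (1.0 - alpha) * ce).mean()
```

A typical call would be `loss = weighted_distillation_loss(student(x), teacher(x).detach(), y)`, so that gradients flow only through the student.

Similarly, the augmented reinforcement-learning state in (2) can be pictured as the usual per-layer embedding extended with the importance quantity i_t. The importance proxy used below (mean absolute weight of the layer) is an assumption, since the thesis only states that i_t measures the importance of the t-th layer.

```python
import torch
import torch.nn as nn

def layer_state(t, layer: nn.Conv2d, layer_flops: float, total_flops: float):
    """State vector for layer t in the RL pruning loop (illustrative sketch)."""
    # Assumed importance proxy i_t: mean absolute weight of the layer.
    i_t = layer.weight.detach().abs().mean().item()
    return torch.tensor([
        float(t),                    # layer index
        float(layer.in_channels),    # input channels
        float(layer.out_channels),   # output channels
        layer_flops / total_flops,   # this layer's share of total FLOPs
        i_t,                         # the added layer-importance state i_t
    ])
```

For the deployment step in (3), a compressed PyTorch model would typically be exported with torch.onnx.export and then converted with NCNN's onnx2ncnn tool before being benchmarked on the mobile device; the thesis does not detail the exact conversion settings used.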
Keywords/Search Tags:Convolutional neural network, Model compression, Knowledge distillation, Attention mechanism, Network pruning, Reinforcement learning