
Research On Deep Neural Network Compression And Acceleration Based On Parameter Pruning

Posted on: 2022-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Feng
Full Text: PDF
GTID: 2558307109469334
Subject: Computer technology
Abstract/Summary:
Deep neural networks (DNNs) have achieved great success in recent years and are used for a wide variety of computer vision tasks. However, their large storage and computational costs limit deployment on resource-constrained devices. To address this limitation, many DNN compression methods have been proposed to improve model efficiency, such as parameter quantization, knowledge distillation, low-rank decomposition, parameter pruning, and compact network design. Among these, network pruning has received wide attention for its excellent performance. Compared with earlier unstructured pruning methods, structured pruning produces non-sparse (regular) convolutional kernels and can therefore exploit existing hardware acceleration. Traditional structured pruning operates at the vector, kernel, filter, or group level. Because a DNN contains a vast amount of parameter information, effectively quantifying that information and using it to guide the removal of redundant or unimportant parameters is a central difficulty in the pruning field. Most structured pruning methods rely on iterative gradient back-propagation and still require fine-tuning of the pruned network; this iterative optimization mechanism seriously reduces efficiency in practical applications. At the same time, these methods often neglect the multi-domain information inside a DNN, leading to imprecise measures of parameter importance.

To overcome the fine-tuning requirement and label dependence of traditional structured pruning, to quantify complex network information efficiently, and to fully account for the distribution of filter information across different domains, the main work of this thesis is as follows:

(1) To quantify the importance of complex deep network parameters accurately and efficiently, a filter selection method based on K-means clustering is proposed. The similarity between filters is computed, and similar (redundant) filters in the network are pruned. Experiments show that the clustering-based pruning method achieves better accuracy and efficiency.

(2) To remove the need for fine-tuning and the dependence on labels, a feature-level network pruning method based on generative adversarial networks (GANs) is proposed. Through adversarial training, pruning proceeds synchronously with training without relying on labels. Experiments indicate that incorporating GANs effectively improves pruning performance and inference efficiency.

(3) To address the challenge of quantifying multi-dimensional information, a structured filter pruning method based on frequency-domain clustering centers is proposed. Filter similarity is measured in both the spatial and frequency domains. The designed pruning algorithm is evaluated to demonstrate the feasibility of frequency-domain pruning.
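The clustering-based filter selection in contribution (1) could be sketched as follows. This is a minimal NumPy illustration only, not the thesis's implementation: the function name, the Lloyd-style k-means loop, and the choice to keep the filter nearest to each centroid are all illustrative assumptions.

```python
import numpy as np

def kmeans_filter_selection(weights, n_keep, n_iter=20, seed=0):
    """Cluster the flattened filters of one conv layer with k-means and
    keep the filter closest to each centroid; the remaining (similar,
    hence redundant) filters are candidates for pruning."""
    rng = np.random.default_rng(seed)
    # weights: (out_channels, in_channels, k, k) -> one row per filter
    flat = weights.reshape(weights.shape[0], -1)
    # Initialise centroids from randomly chosen filters
    centroids = flat[rng.choice(len(flat), n_keep, replace=False)]
    for _ in range(n_iter):
        # Assign each filter to its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(flat[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned filters
        for c in range(n_keep):
            if np.any(labels == c):
                centroids[c] = flat[labels == c].mean(axis=0)
    # Keep only the single filter nearest to each centroid
    keep = {int(np.linalg.norm(flat - centroids[c], axis=1).argmin())
            for c in range(n_keep)}
    return sorted(keep)

# Toy layer: 8 filters of shape (3, 3, 3); retain 4 representatives
w = np.random.default_rng(1).normal(size=(8, 3, 3, 3))
kept = kmeans_filter_selection(w, n_keep=4)
print(kept)  # indices of the filters retained after pruning
```

In a real pruning pipeline the retained indices would then be used to slice the layer's weight tensor and the corresponding input channels of the next layer.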
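For the frequency-domain similarity used in contribution (3), one plausible reading is that each kernel is mapped to its 2-D DFT magnitude spectrum before similarity is measured. The sketch below is an assumption-laden illustration (the DFT choice, cosine similarity, and function name are not taken from the thesis):

```python
import numpy as np

def frequency_similarity(weights):
    """Pairwise filter similarity in the frequency domain: each spatial
    kernel is mapped to its 2-D DFT magnitude spectrum, and cosine
    similarity is computed between the flattened spectra."""
    # weights: (out_channels, in_channels, k, k); fft2 acts on the
    # last two axes, so every kernel is transformed independently
    spec = np.abs(np.fft.fft2(weights)).reshape(weights.shape[0], -1)
    norm = spec / np.linalg.norm(spec, axis=1, keepdims=True)
    return norm @ norm.T  # (out, out) cosine-similarity matrix

w = np.random.default_rng(0).normal(size=(6, 3, 3, 3))
sim = frequency_similarity(w)
# Diagonal entries are self-similarities and equal 1
print(np.allclose(np.diag(sim), 1.0))  # True
```

A spatial-domain similarity matrix computed directly on the flattened kernels could be combined with this one, matching the abstract's statement that similarity is measured in both domains.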
Keywords/Search Tags:Deep neural network compression, generative adversarial network, K-means clustering, structured pruning, compressed video, multi-domain information quantification