
Similarity-Based Approach To Neural Network Pruning

Posted on: 2021-07-25 | Degree: Master | Type: Thesis
Country: China | Candidate: Y X Zhang | Full Text: PDF
GTID: 2518306503480704 | Subject: Electronics and Communications Engineering
Abstract/Summary:
Faced with the radically increased computation and storage requirements of deep convolutional neural networks (CNNs) in recent years, researchers have proposed various methods for model compression and acceleration, including low-rank factorization, network pruning, weight quantization, neural architecture search and knowledge distillation. Among these approaches, algorithms based on network pruning are of particular research interest given their favorable balance between simplicity and capacity.

Despite their impressive performance, existing network pruning algorithms suffer to varying degrees from inconveniences in practical application scenarios, for instance the necessity of training from scratch with sparsity regularization or complex data-driven optimization, and the dependence on specific hardware and libraries or pre-defined target architectures.

To address the aforementioned issues once and for all, we set out from a novel perspective to explore parameter redundancy and accelerate deep CNNs. Based on the intuition that channels revealing similar feature information are highly correlated and have functional overlap, we argue that each such similarity group can be effectively reduced to a few representatives with little impact on the representational power of the network. After introducing two theoretically well-founded metrics to evaluate structural filter similarity and channel similarity respectively, we develop an efficient channel-level pruning framework based on hierarchical clustering and ℓ1-norm channel selection. In particular, the proposed similarity-based pruning algorithms can be applied directly to all kinds of pre-trained CNN models for better trade-offs between latency and accuracy. Moreover, rather than relying on pre-defined target architectures, they can automatically discover resource-efficient structures within the original model under given budgets, which can be regarded in part as performing neural architecture search. Compared with previous magnitude-based approaches to network pruning, the proposed similarity-based approach is more general in that it makes no assumptions about the distribution of model weights or activation values, such as large deviation and small minimum, and therefore remains applicable in a wider range of practical application scenarios.

Comprehensive experimental results demonstrate the superior performance of our approach over prior art. Besides regular image classification experiments on benchmark datasets and representative CNN architectures, we further extend the proposed method to a highly compact and hard-to-prune CNN model, MobileNet V2, and a GAN-based generative model, DCGAN, to showcase its generalization potential and flexibility in real-life applications.
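The following is a minimal sketch of the general idea described above, not the thesis's exact algorithm or similarity metrics: it flattens the filters of one convolutional layer, groups them by hierarchical clustering, and keeps one representative per similarity group chosen by largest ℓ1-norm. The cosine metric, average linkage, and the helper name `select_channels` are illustrative assumptions.

```python
# Minimal sketch of similarity-based channel selection (assumptions: cosine
# distance, average linkage, l1-norm representative). Not the thesis's exact method.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def select_channels(weight, keep_ratio=0.5):
    """weight: (out_channels, in_channels, k, k) conv-layer weights.
    Returns the indices of output channels (filters) to keep."""
    n_filters = weight.shape[0]
    flat = weight.reshape(n_filters, -1)          # one row per filter
    n_keep = max(1, int(round(n_filters * keep_ratio)))

    # Agglomerative clustering on pairwise cosine distance between filters.
    Z = linkage(flat, method="average", metric="cosine")
    labels = fcluster(Z, t=n_keep, criterion="maxclust")

    # Within each similarity group, keep the filter with the largest l1-norm.
    l1 = np.abs(flat).sum(axis=1)
    keep = [int(np.where(labels == c)[0][np.argmax(l1[labels == c])])
            for c in np.unique(labels)]
    return sorted(keep)

# Example: reduce a random 64-filter layer to roughly half of its channels.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32, 3, 3))
print(select_channels(w, keep_ratio=0.5))
```

In practice the retained indices would be used to slice the layer's weights and the input channels of the following layer; budget-driven pruning as described in the abstract would additionally choose `keep_ratio` per layer under a global resource constraint.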
Keywords/Search Tags: Redundant Feature Information, Model Compression and Acceleration, Channel-Level Neural Network Pruning, Deep Learning