
Similarity-Based Approach To Neural Network Pruning

Posted on: 2021-07-25 | Degree: Master | Type: Thesis
Country: China | Candidate: Y X Zhang | Full Text: PDF
GTID: 2518306503480704 | Subject: Electronics and Communications Engineering
Abstract/Summary:
Faced with the radically increased computation and storage requirements of deep convolutional neural networks (CNNs) in recent years, researchers have proposed various methods for model compression and acceleration, including low-rank factorization, network pruning, weight quantization, neural architecture search and knowledge distillation. Among these approaches, algorithms based on network pruning are of particular research interest given their favorable balance between simplicity and capacity.

Despite their impressive performance, existing network pruning algorithms suffer to varying degrees from inconveniences in practical application scenarios, for instance the necessity of training from scratch with sparsity regularization or complex data-driven optimization, and the dependence on specific hardware and libraries or pre-defined target architectures.

To address the aforementioned issues once and for all, we set out from a novel perspective to explore parameter redundancy and accelerate deep CNNs. Based on the intuition that channels revealing similar feature information are highly correlated and have functional overlap, we argue that each such similarity group can be effectively reduced to a few representatives with little impact on the representational power of the network. After introducing two theoretically well-founded metrics to evaluate structural filter similarity and channel similarity respectively, we develop an efficient channel-level pruning framework based on hierarchical clustering and ℓ1-norm channel selection. In particular, the proposed similarity-based pruning algorithms can be applied directly to all kinds of pre-trained CNN models for better trade-offs between latency and accuracy. Moreover, rather than relying on pre-defined target architectures, they can automatically discover resource-efficient structures within the original model under given budgets, which can be regarded in part as performing neural architecture search. Compared with previous magnitude-based approaches to network pruning, the proposed similarity-based approach is more general in that it makes no assumptions about the distribution of model weights or activation values, such as large deviation and small minimum, and therefore remains applicable in a wider range of practical application scenarios.

Comprehensive experimental results demonstrate the superior performance of our approach over prior art. Besides regular image classification experiments on benchmark datasets and representative CNN architectures, we further extend the proposed method to a highly compact and hard-to-prune CNN model, MobileNet V2, and a GAN-based generative model, DCGAN, to showcase its generalization potential and flexibility in real-life applications.
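The following is a minimal sketch of the general idea described above, not the thesis's exact algorithm or similarity metrics: it flattens the filters of one convolutional layer, groups them by hierarchical clustering, and keeps one representative per similarity group chosen by largest ℓ1-norm. The cosine metric, average linkage, and the helper name `select_channels` are illustrative assumptions.

```python
# Minimal sketch of similarity-based channel selection (assumptions: cosine
# distance, average linkage, l1-norm representative). Not the thesis's exact method.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def select_channels(weight, keep_ratio=0.5):
    """weight: (out_channels, in_channels, k, k) conv-layer weights.
    Returns the indices of output channels (filters) to keep."""
    n_filters = weight.shape[0]
    flat = weight.reshape(n_filters, -1)          # one row per filter
    n_keep = max(1, int(round(n_filters * keep_ratio)))

    # Agglomerative clustering on pairwise cosine distance between filters.
    Z = linkage(flat, method="average", metric="cosine")
    labels = fcluster(Z, t=n_keep, criterion="maxclust")

    # Within each similarity group, keep the filter with the largest l1-norm.
    l1 = np.abs(flat).sum(axis=1)
    keep = [int(np.where(labels == c)[0][np.argmax(l1[labels == c])])
            for c in np.unique(labels)]
    return sorted(keep)

# Example: reduce a random 64-filter layer to roughly half of its channels.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32, 3, 3))
print(select_channels(w, keep_ratio=0.5))
```

In practice the retained indices would be used to slice the layer's weights and the input channels of the following layer; budget-driven pruning as described in the abstract would additionally choose `keep_ratio` per layer under a global resource constraint.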
Keywords/Search Tags: Redundant Feature Information, Model Compression and Acceleration, Channel-Level Neural Network Pruning, Deep Learning