
Research And Application Of Neural Network Model Compression Based On Weight Pruning

Posted on: 2021-02-16    Degree: Master    Type: Thesis
Country: China    Candidate: L H Zhong    Full Text: PDF
GTID: 2428330623968337    Subject: Engineering
Abstract/Summary:
In recent years, owing to their excellent feature extraction capabilities, deep neural networks (DNNs) have made breakthrough progress in computer vision, natural language processing, and other fields. The rapid development of DNNs has also continuously driven their wide application in engineering. However, this performance comes with increased computational complexity and dense model storage, which limits the deployment of DNNs on resource-constrained embedded devices. Weight pruning is an effective model compression method for addressing these problems. The work of this thesis on weight pruning is summarized as follows:

1. The thesis introduces the basic framework of model compression and surveys the field, covering the current state of research as well as the achievements and challenges of existing methods.

2. Adaptive Weight Pruning based on Interval Threshold Search (AWP-ITS) and an improved variant are proposed. Most current pruning algorithms use a pruning rate to guide pruning, which requires repeatedly trying different pruning rates to obtain the best model and is therefore time-consuming and inefficient. To solve this problem, the proposed algorithm uses model accuracy to guide pruning. It sets a threshold on the acceptable accuracy drop, which not only ensures that the accuracy loss of the model stays within a given range but also allows the accuracy target to be adjusted adaptively to meet the requirements of engineering applications. The algorithm first places equally spaced candidate thresholds within the range of the parameter extrema, and a grid search is then used to find the optimal pruning threshold with an acceptable accuracy drop (a minimal sketch of this search follows the abstract). Building on AWP-ITS, Adaptive Weight Pruning based on Binary-combined Interval Threshold Search (AWP-BITS) is further proposed; it incorporates binary search and improves the parameter update strategy, further improving the effectiveness of the base algorithm.

3. Frequency-domain Weight Pruning based on the Discrete Cosine Transform (FWP-DCT) is proposed. Most current pruning algorithms operate in the spatial domain, where the potential interrelationships between model parameters are sometimes difficult to capture. To solve this problem, we draw on the success of the Discrete Cosine Transform (DCT) in image compression and propose FWP-DCT. The algorithm first transforms the parameters from the spatial domain to the frequency domain via the DCT; the proposed interval threshold search is then used to find the best pruning threshold in the frequency domain, and the model is pruned with that threshold (a simplified sketch also follows the abstract). Considering the influence of the input data on the importance of model parameters, a Data-driven Frequency-domain Pruning algorithm is further proposed. It establishes the consistency between convolution operations in the spatial domain and product operations in the frequency domain, strengthens the correlation between model parameters and input data, and allows parameter updates to be performed directly in the frequency domain.

The thesis uses the PyTorch deep learning framework to test the effectiveness of the proposed algorithms on three neural networks, and the experimental results are compared and analyzed. Finally, the pruned models are stored in a sparse format, validating the engineering application value of the proposed algorithms.
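The accuracy-guided interval threshold search in point 2 can be illustrated with a minimal sketch. The helper names, the `evaluate` callback (assumed to return validation accuracy), and the early-stopping grid loop below are illustrative assumptions and do not reproduce the thesis's actual implementation.

```python
# Minimal sketch of accuracy-guided magnitude pruning via interval threshold
# search (in the spirit of AWP-ITS). evaluate(model) -> accuracy is assumed.
import copy
import torch
import torch.nn as nn


def prune_below(model: nn.Module, threshold: float) -> None:
    """Zero out all conv/linear weights whose magnitude is below the threshold."""
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                mask = (module.weight.abs() >= threshold).to(module.weight.dtype)
                module.weight.mul_(mask)


def interval_threshold_search(model, evaluate, acc_drop=0.01, num_intervals=20):
    """Grid-search equally spaced thresholds between 0 and the largest absolute
    weight; return the most aggressive threshold whose accuracy loss stays
    within `acc_drop` of the unpruned baseline."""
    baseline = evaluate(model)
    w_max = max(m.weight.abs().max().item()
                for m in model.modules()
                if isinstance(m, (nn.Conv2d, nn.Linear)))
    best_t, best_model = 0.0, copy.deepcopy(model)
    for i in range(1, num_intervals + 1):
        t = w_max * i / num_intervals
        candidate = copy.deepcopy(model)
        prune_below(candidate, t)
        if baseline - evaluate(candidate) <= acc_drop:
            best_t, best_model = t, candidate
        else:
            break  # larger thresholds prune strictly more weights
    return best_t, best_model
```

Under these assumptions, AWP-BITS would refine the final step by binary-searching between the last accepted and the first rejected grid thresholds, rather than stopping at the coarse grid value.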
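For point 3, a simplified frequency-domain pruning sketch is given below. It applies a kernel-wise 2-D DCT with SciPy and keeps only the largest coefficients; the fixed `keep_ratio` quantile stands in for the thesis's interval threshold search and, like the per-kernel granularity, is purely an illustrative assumption.

```python
# Minimal sketch of frequency-domain magnitude pruning (in the spirit of
# FWP-DCT): DCT each convolutional kernel, zero small frequency coefficients,
# then transform back to the spatial domain.
import numpy as np
from scipy.fft import dctn, idctn
import torch
import torch.nn as nn


def dct_prune_conv(module: nn.Conv2d, keep_ratio: float = 0.3) -> None:
    """Prune one Conv2d layer by thresholding its 2-D DCT coefficients."""
    with torch.no_grad():
        w = module.weight.detach().cpu().numpy()            # (out, in, kH, kW)
        coeff = dctn(w, axes=(-2, -1), norm="ortho")         # kernel-wise 2-D DCT
        cutoff = np.quantile(np.abs(coeff), 1.0 - keep_ratio)
        coeff[np.abs(coeff) < cutoff] = 0.0                  # drop small coefficients
        pruned = idctn(coeff, axes=(-2, -1), norm="ortho")   # back to spatial domain
        module.weight.copy_(torch.from_numpy(pruned).to(module.weight))


def dct_prune_model(model: nn.Module, keep_ratio: float = 0.3) -> None:
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            dct_prune_conv(m, keep_ratio)
```

Note that in this sketch the sparsity lives in the DCT coefficients rather than in the reconstructed spatial weights, which is the natural place for a sparse storage format to apply in a frequency-domain scheme.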
Keywords/Search Tags:Deep Neural Networks, Model Compression, Weight Pruning, Interval Search, Discrete Cosine Transform