
Research On Lossless Compression Technology For Neural Networks

Posted on: 2020-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: C X Wang
Full Text: PDF
GTID: 2428330575458132
Subject: Microelectronics and Solid State Electronics

Abstract/Summary:
The computation of deep neural networks is a compute- and memory-intensive task. Model compression, which has become a research hotspot, can reduce the energy consumption, memory usage, and bandwidth that this computation requires. So far, the main focus of model compression has been on pruning and quantization. This paper explores another perspective: the compression ratio that can be achieved by applying lossless compression and sparse matrix storage on top of pruning and quantization.

This paper constructs an extensible model compression evaluation framework. Once the algorithms under test and the test data are prepared according to certain input and output rules, the framework automatically measures the compression ratio of each algorithm on the specified data. The paper surveys general-purpose lossless compression algorithms and sparse matrix storage formats, analyzing their advantages and disadvantages. On this basis, six algorithms are selected, forming 36 algorithm combinations. Using the evaluation framework, these combinations are tested on ResNet18 and MobileNet models that have been pruned and quantized. The results show that entropy coding achieves the highest compression ratio for fine-grained pruning, but cascading two entropy coders yields only a minor additional gain. Among the sparse matrix storage formats, the bitmap achieves a good compression ratio at a computational complexity much lower than that of entropy coding.

The parameters of the two models are then differentiated (taking the difference between adjacent parameters), which greatly increases model sparsity. With all other conditions unchanged, this raises the compression ratio by 1.3×, demonstrating that the model parameters contain additional structure that can be exploited for compression. Compared with CSC, the format commonly used in neural network accelerators, the simpler differential + bitmap scheme improves the compression ratio by 1.5×, while the more complex differential + Huffman scheme provides more than a 1.7× improvement, greatly reducing the transmission bandwidth and storage requirements of the model.
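To make the differential + bitmap idea concrete, the following is a minimal sketch, not the thesis's actual implementation; the function names and the toy weight vector are illustrative assumptions. It takes wrap-around (mod-256) differences of adjacent quantized int8 weights, which turns runs of repeated quantized values into extra zeros, then stores only a one-bit-per-entry occupancy bitmap plus the nonzero byte values:

```python
import numpy as np

def delta_encode(w: np.ndarray) -> np.ndarray:
    """Mod-256 differences of adjacent parameters (exactly invertible)."""
    u = w.view(np.uint8)
    d = np.empty_like(u)
    d[0] = u[0]
    d[1:] = u[1:] - u[:-1]          # uint8 arithmetic wraps mod 256
    return d

def delta_decode(d: np.ndarray) -> np.ndarray:
    """Prefix sums in a uint8 accumulator undo the wrapped differencing."""
    return np.cumsum(d, dtype=np.uint8).view(np.int8)

def bitmap_encode(d: np.ndarray):
    """Split a sparse byte vector into a 1-bit-per-entry occupancy bitmap
    and the packed nonzero values."""
    mask = d != 0
    return np.packbits(mask), d[mask]

def bitmap_decode(bits: np.ndarray, vals: np.ndarray, n: int) -> np.ndarray:
    """Rebuild the sparse byte vector from bitmap plus nonzero values."""
    mask = np.unpackbits(bits, count=n).astype(bool)
    d = np.zeros(n, dtype=np.uint8)
    d[mask] = vals
    return d

# Toy pruned+quantized weights: zeros come from pruning, repeated values
# from quantization; differencing turns the repeats into additional zeros.
w = np.array([0, 0, 12, 12, 12, 0, -3, -3, 0, 0, 7, 7, 7, 7], dtype=np.int8)

bits, vals = bitmap_encode(delta_encode(w))
assert np.array_equal(delta_decode(bitmap_decode(bits, vals, w.size)), w)

print(f"dense: {w.nbytes} B, differential+bitmap: {bits.nbytes + vals.nbytes} B")
# dense: 14 B, differential+bitmap: 7 B  (2 bitmap bytes + 5 value bytes)
```

In this toy vector, differencing raises the zero count from 5 to 9 of 14 entries, and the bitmap then stores the result in 7 bytes instead of 14. Actual gains depend on pruning granularity and the quantization codebook, as the measurements in this paper show.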
Keywords/Search Tags: Neural Network, Model Compression, Lossless Compression, Sparse Matrix Storage