With the widespread use of convolutional neural networks in various fields, the total number of their parameters is growing exponentially, and the resulting storage and computational costs seriously hinder their further use and deployment on embedded and mobile devices. This paper therefore studies the compression of convolutional neural networks; the main contributions are as follows.

(1) The concept of an "edge point" is defined, and a convolutional neural network compression method based on network pruning and weight quantization is proposed. Using the LeNet model under the Caffe framework, the compression effect of the method is tested on the Conv1 and Conv2 layers of LeNet on the MNIST data set. The experimental results show that the Conv1 layer is compressed from 1.95 KB to 0.069 KB (about 28 times), the Conv2 layer from 97.66 KB to 3.06 KB (about 32 times), and the two layers together from 99.61 KB to 3.13 KB (about 32 times), while the accuracy of the model remains essentially unchanged.

(2) On top of network pruning and weight quantization, Huffman coding is introduced to improve the proposed compression method, further enhancing the compression effect by reducing the storage space of the non-zero weight indexes. Using the CIFAR-10 example model of the Caffe framework, the compression effect is tested on the Conv1 and Conv2 layers of the CIFAR-10 network. The experimental results show that the Conv1 layer, previously compressed from 9.3 KB to 0.507 KB, is further compressed to 0.481 KB: the weight-index width drops from a fixed 3 bits to an average of 2.72 bits, and the compression ratio improves from 18.3 to 19.3 times. The Conv2 layer, previously compressed from 100 KB to 3.85 KB, is further compressed to 3.66 KB: the weight-index width drops from 4 bits to an average of 3.16 bits, and the compression ratio improves from 26 to 27.3 times. The Conv1 and Conv2 layers are simultaneously
compressed from 109.38 KB to 4.35 KB and further to 4.17 KB, and the compression ratio improves from 25.1 to 26.2 times. The accuracy of the model remains essentially unchanged.

In summary, in order to reduce the number of network parameters and improve the efficiency of network computation, this paper proposes a convolutional neural network compression method and demonstrates it experimentally with an analysis of the results. The results show that the proposed method is feasible and effective on at least two convolutional neural network models.
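The three-stage pipeline described above (magnitude pruning, shared-value quantization, Huffman coding of the quantization indexes) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's actual implementation: the function names are invented for the example, a simple uniform quantizer stands in for the thesis's "edge point"-based quantization, and the Huffman step only measures the average bits per index rather than emitting an encoded stream.

```python
import heapq
from collections import Counter


def prune(weights, threshold):
    """Magnitude pruning: zero every weight whose absolute value is below threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def quantize(weights, n_clusters):
    """Uniformly quantize the non-zero weights into n_clusters shared values.

    Returns (indexes, codebook): each non-zero weight is replaced by a small
    index into the codebook, so only the indexes and the codebook are stored.
    """
    nonzero = [w for w in weights if w != 0.0]
    lo, hi = min(nonzero), max(nonzero)
    step = (hi - lo) / n_clusters or 1.0  # avoid /0 when all weights are equal
    codebook = [lo + (i + 0.5) * step for i in range(n_clusters)]
    indexes = [min(int((w - lo) / step), n_clusters - 1) for w in nonzero]
    return indexes, codebook


def huffman_avg_bits(symbols):
    """Average code length (bits per symbol) of a Huffman code for the stream."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single symbol still needs one bit
        return 1.0
    heap = [[f, [s, ""]] for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        for pair in a[1:]:
            pair[1] = "0" + pair[1]  # left branch prepends a 0 bit
        for pair in b[1:]:
            pair[1] = "1" + pair[1]  # right branch prepends a 1 bit
        heapq.heappush(heap, [a[0] + b[0]] + a[1:] + b[1:])
    codes = {s: c for s, c in heap[0][1:]}
    return sum(freq[s] * len(codes[s]) for s in freq) / len(symbols)
```

Huffman coding only pays off because the quantization indexes are not uniformly distributed: frequent indexes get short codes, so the average width can fall below the fixed width (for example, below the fixed 4 bits reported for the Conv2 layer above).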