
Research On Convolutional Neural Network Compression Strategy For Edge Computing

Posted on: 2020-12-03    Degree: Master    Type: Thesis
Country: China    Candidate: C R Zhong    Full Text: PDF
GTID: 2428330596995444    Subject: Computer technology
Abstract/Summary:
As the Internet of Things (IoT) gradually upgrades traditional industries, deep learning technology is increasingly applied to wearable devices, smart homes, unmanned digital factories, unmanned logistics, smart medical care, and more. However, it remains difficult to deploy powerful large-scale Convolutional Neural Network (CNN) models on edge devices such as FPGAs (Field Programmable Gate Arrays) and embedded devices, whose hardware resources are limited. There is therefore an urgent need to compress CNN models effectively so that they meet the deployment requirements of edge devices.

A great deal of research already exists on CNN model compression, with researchers exploring different ways to reduce network redundancy. However, existing compression methods still face several challenges in edge computing: 1) they are limited to small data sets and network models; 2) they consider only one aspect of the requirements; and 3) they implement accelerated inference only for GPU processors. They therefore struggle to meet the requirements of practical scenarios in which large-scale CNN models are deployed at the edge. Hence, this paper proposes a compression method oriented toward edge devices that addresses these limitations and the shortcomings of existing CNN model compression. To meet both the accuracy requirements of CNN models in practical settings and the deployment requirements of edge devices, we propose a search framework that explores the balance among the redundancies of each dimension of the CNN model, so that the model size is compressed as much as possible while task accuracy is preserved.

The main work of this paper includes:
1) According to the structure of large CNN models and the characteristics of edge devices, we propose a compression method for CNN models oriented toward edge deployment that compresses the network along two dimensions simultaneously: network connectivity and data bit width. Under the premise of preserving network accuracy, it compresses CNN models effectively.
2) Because deploying large CNN models demands excessive storage and computational resources, we establish a weight-redundancy search framework based on an analysis of the redundancy of weight connections in network models. While preserving network accuracy, it prunes redundant network weights to the greatest possible degree (an illustrative pruning sketch follows this abstract).
3) To resolve the conflict between complex floating-point operations and the limited computational capability of edge devices such as FPGAs and embedded devices, a dynamic fixed-point model is used to quantize the network. We further analyze the data-bit-width redundancy in network models and construct a bit-width redundancy search framework. While preserving network accuracy, it represents each part of the network's data in fixed-point form with fewer bits (an illustrative quantization sketch is also given below).
4) Experiments on large CNN models with different structures verify that the proposed edge-oriented CNN compression method can effectively compress the network model to satisfy the requirements of edge devices while maintaining network accuracy.
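The abstract does not include source code, so the following is only a minimal NumPy sketch of magnitude-based weight pruning combined with a greedy per-layer sparsity search, meant to illustrate the kind of weight-redundancy search described in point 2. The `evaluate` callback, the candidate sparsity grid, and the accuracy tolerance are assumptions introduced here for illustration, not the thesis's exact procedure.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitudes."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]     # k-th smallest magnitude
    return weights * (np.abs(weights) > threshold)   # keep only larger weights

def search_layer_sparsity(weights, evaluate, baseline_acc, tolerance=0.01,
                          candidates=(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1)):
    """Greedy search: keep the most aggressive sparsity whose accuracy drop
    stays within `tolerance`.

    `evaluate(pruned_weights)` is a hypothetical callback that plugs the
    pruned layer back into the model and returns task accuracy.
    """
    for s in candidates:                      # most aggressive sparsity first
        pruned = prune_by_magnitude(weights, s)
        if baseline_acc - evaluate(pruned) <= tolerance:
            return s, pruned
    return 0.0, weights.copy()                # no pruning level fits the budget
```

Applied layer by layer, such a search yields a per-layer sparsity profile rather than one global threshold, which is the usual way a redundancy analysis of weight connections is turned into a concrete pruning plan.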
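Similarly, the sketch below illustrates the dynamic fixed-point quantization mentioned in point 3: each tensor shares one fractional length, chosen so that its largest magnitude fits in the given signed bit width, and the bit width itself can then be searched with the same greedy accuracy-tolerance loop as above. The function name and the choice of integer/fractional split are assumptions for illustration only.

```python
import numpy as np

def dynamic_fixed_point_quantize(x, bit_width):
    """Simulate dynamic fixed-point quantization of a tensor.

    One fractional length is shared by the whole tensor: enough integer
    bits (plus sign) are reserved for the largest magnitude, and the
    remaining bits are fractional. Returns the dequantized tensor and
    the chosen fractional length.
    """
    max_val = float(np.max(np.abs(x)))
    if max_val == 0.0:
        return np.zeros_like(x), bit_width - 1
    int_bits = int(np.floor(np.log2(max_val))) + 2       # sign + integer part
    frac_bits = bit_width - int_bits
    scale = 2.0 ** frac_bits
    q_min, q_max = -2 ** (bit_width - 1), 2 ** (bit_width - 1) - 1
    q = np.clip(np.round(x * scale), q_min, q_max)       # round-to-nearest, saturate
    return q / scale, frac_bits

# Example: quantize a random convolution weight tensor to 8-bit dynamic fixed point.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
w_q, frac = dynamic_fixed_point_quantize(w, bit_width=8)
print("fractional bits:", frac, "max abs error:", float(np.max(np.abs(w - w_q))))
```

Because the fractional length adapts to each tensor's range, weights, activations, and biases can each be stored with a different radix point while sharing the same narrow integer bit width, which is what makes the format attractive for FPGA and embedded deployment.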
Keywords/Search Tags: Convolutional neural network, Edge computing, Network pruning, Quantization, Network compression