
Research On Convolutional Neural Network Compression Strategy For Edge Computing

Posted on: 2020-12-03    Degree: Master    Type: Thesis
Country: China    Candidate: C R Zhong    Full Text: PDF
GTID: 2428330596995444    Subject: Computer technology
Abstract/Summary:
As the Internet of Things (IoT) gradually upgrades traditional industries, deep learning technology is increasingly applied to wearable devices, smart homes, unmanned digital factories, unmanned logistics, smart medical care, and more. However, it remains difficult to deploy powerful large-scale Convolutional Neural Network (CNN) models on edge devices such as FPGAs (Field Programmable Gate Arrays) and embedded devices, whose hardware resources are limited. There is therefore an urgent need to compress CNN models effectively so that they meet the deployment requirements of edge devices.

A great deal of research already exists on CNN model compression, with researchers exploring different ways to reduce network redundancy. However, existing compression methods still face several challenges in edge computing: 1) they are limited to small data sets and network models; 2) they consider only one aspect of the requirements; and 3) they implement accelerated inference only for GPU processors. They therefore struggle to meet the requirements of practical scenarios in which large-scale CNN models are deployed at the edge. Hence, this paper proposes a compression method oriented toward edge devices that addresses these limitations and the shortcomings of existing CNN model compression. To meet both the accuracy requirements of CNN models in practical settings and the deployment requirements of edge devices, we propose a search framework that explores the balance among the redundancies of each dimension of the CNN model, so that the model size is compressed as much as possible while task accuracy is preserved.

The main work of this paper includes:
1) According to the structure of large CNN models and the characteristics of edge devices, we propose a compression method for CNN models oriented toward edge deployment that compresses the network along two dimensions simultaneously: network connectivity and data bit width. Under the premise of preserving network accuracy, it compresses CNN models effectively.
2) Because deploying large CNN models demands excessive storage and computational resources, we establish a weight-redundancy search framework based on an analysis of the redundancy of weight connections in network models. While preserving network accuracy, it prunes redundant network weights to the greatest possible degree (an illustrative pruning sketch follows this abstract).
3) To resolve the conflict between complex floating-point operations and the limited computational capability of edge devices such as FPGAs and embedded devices, a dynamic fixed-point model is used to quantize the network. We further analyze the data-bit-width redundancy in network models and construct a bit-width redundancy search framework. While preserving network accuracy, it represents each part of the network's data in fixed-point form with fewer bits (an illustrative quantization sketch is also given below).
4) Experiments on large CNN models with different structures verify that the proposed edge-oriented CNN compression method can effectively compress the network model to satisfy the requirements of edge devices while maintaining network accuracy.
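The abstract does not include source code, so the following is only a minimal NumPy sketch of magnitude-based weight pruning combined with a greedy per-layer sparsity search, meant to illustrate the kind of weight-redundancy search described in point 2. The `evaluate` callback, the candidate sparsity grid, and the accuracy tolerance are assumptions introduced here for illustration, not the thesis's exact procedure.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitudes."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]     # k-th smallest magnitude
    return weights * (np.abs(weights) > threshold)   # keep only larger weights

def search_layer_sparsity(weights, evaluate, baseline_acc, tolerance=0.01,
                          candidates=(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1)):
    """Greedy search: keep the most aggressive sparsity whose accuracy drop
    stays within `tolerance`.

    `evaluate(pruned_weights)` is a hypothetical callback that plugs the
    pruned layer back into the model and returns task accuracy.
    """
    for s in candidates:                      # most aggressive sparsity first
        pruned = prune_by_magnitude(weights, s)
        if baseline_acc - evaluate(pruned) <= tolerance:
            return s, pruned
    return 0.0, weights.copy()                # no pruning level fits the budget
```

Applied layer by layer, such a search yields a per-layer sparsity profile rather than one global threshold, which is the usual way a redundancy analysis of weight connections is turned into a concrete pruning plan.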
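Similarly, the sketch below illustrates the dynamic fixed-point quantization mentioned in point 3: each tensor shares one fractional length, chosen so that its largest magnitude fits in the given signed bit width, and the bit width itself can then be searched with the same greedy accuracy-tolerance loop as above. The function name and the choice of integer/fractional split are assumptions for illustration only.

```python
import numpy as np

def dynamic_fixed_point_quantize(x, bit_width):
    """Simulate dynamic fixed-point quantization of a tensor.

    One fractional length is shared by the whole tensor: enough integer
    bits (plus sign) are reserved for the largest magnitude, and the
    remaining bits are fractional. Returns the dequantized tensor and
    the chosen fractional length.
    """
    max_val = float(np.max(np.abs(x)))
    if max_val == 0.0:
        return np.zeros_like(x), bit_width - 1
    int_bits = int(np.floor(np.log2(max_val))) + 2       # sign + integer part
    frac_bits = bit_width - int_bits
    scale = 2.0 ** frac_bits
    q_min, q_max = -2 ** (bit_width - 1), 2 ** (bit_width - 1) - 1
    q = np.clip(np.round(x * scale), q_min, q_max)       # round-to-nearest, saturate
    return q / scale, frac_bits

# Example: quantize a random convolution weight tensor to 8-bit dynamic fixed point.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
w_q, frac = dynamic_fixed_point_quantize(w, bit_width=8)
print("fractional bits:", frac, "max abs error:", float(np.max(np.abs(w - w_q))))
```

Because the fractional length adapts to each tensor's range, weights, activations, and biases can each be stored with a different radix point while sharing the same narrow integer bit width, which is what makes the format attractive for FPGA and embedded deployment.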
Keywords/Search Tags: Convolutional neural network, Edge computing, Network pruning, Quantization, Network compression