
Research And Implementation Of Deep Convolutional Neural Network Model Compression Algorithm

Posted on: 2021-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: K Yang
Full Text: PDF
GTID: 2428330623473610
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the information age, the wave of artificial intelligence is rising rapidly. Neural networks, the foundation of artificial intelligence, have developed quickly in recent years. Although their performance has reached a very high level, they remain resource- and energy-intensive. At the same time, with the rise of edge computing, the number of edge devices is growing exponentially, yet these devices are characterized by limited resources and low power budgets, which runs counter to the current state of neural networks. It is therefore of great significance to compress existing neural network models and deploy them on edge devices. The core of this thesis is the compression of deep convolutional neural network models and their eventual deployment on resource-constrained edge devices. The main research contents are:

1. Research on deep neural network compression algorithms (pruning). Pruning is currently one of the mainstream methods for compressing deep neural network models, but how to set a different pruning rate for each layer, given the uneven distribution of information across layers, remains an open question. This thesis proposes a layer-recovery sensitivity analysis method that explores each layer's contribution to network performance more accurately. Based on the differences in contribution, a hierarchical pruning-rate method is proposed that groups layers by contribution and sets each layer's pruning rate more precisely. Experiments show that, compared with FPGM, currently the best-performing pruning method, the hierarchical pruning method steadily improves the classification accuracy of the pruned model at essentially the same pruning ratio; furthermore, the model can be pruned to a higher ratio.

2. Research on comprehensive compression of deep convolutional neural network models. Many compression methods for deep neural network models now exist. This thesis analyzes the existing compression techniques, selects the VGG-16 deep convolutional neural network model, and applies multiple methods to compress the network comprehensively. The final compression results: with the performance loss of the model kept within 3%, the amount of computation is reduced by more than 45%, the number of parameters is reduced to below 10% of the original, and the model storage size is likewise reduced to below 10% of the original; the energy efficiency on the smart terminal chip reaches 1 TFLOPS/W.

3. Research on rapid deployment of the comprehensively compressed model at the edge. Heterogeneous edge computing has become a new computing paradigm whose goal is to move computation from the cloud to the edge, and rapid deployment of edge computing tasks is critical to its full implementation. This thesis designs an extensible method for rapid task deployment at the edge, which uses cross-platform container technology to encapsulate the software environment of the heterogeneous computing unit and the dependencies required at runtime, thereby achieving rapid deployment of edge-side experimental tasks. Extensive experiments show that the task deployment platform introduces almost no additional time overhead and occupies no additional computing resources.
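The hierarchical pruning-rate idea described in item 1 can be sketched as follows: each layer's sensitivity (e.g. the accuracy drop observed when that layer is pruned at a fixed probe rate) is mapped to a tier, and less sensitive layers receive more aggressive pruning rates. This is only an illustrative sketch; the layer names, sensitivity values, and tier thresholds below are hypothetical and are not taken from the thesis.

```python
def assign_pruning_rates(sensitivity,
                         tiers=((0.01, 0.7), (0.05, 0.5), (float("inf"), 0.3))):
    """Map per-layer sensitivity scores to pruning rates.

    sensitivity: dict of layer name -> accuracy drop when the layer is
                 pruned at a fixed probe rate (higher = more important).
    tiers: (threshold, rate) pairs; a layer falls into the first tier
           whose threshold its sensitivity does not exceed.
    """
    rates = {}
    for layer, score in sensitivity.items():
        for threshold, rate in tiers:
            if score <= threshold:
                rates[layer] = rate
                break
    return rates

# Hypothetical example: conv1 is highly sensitive (large accuracy drop
# when pruned), so it gets the mildest rate; conv3 barely contributes,
# so it is pruned most aggressively.
scores = {"conv1": 0.08, "conv2": 0.03, "conv3": 0.005}
print(assign_pruning_rates(scores))
# → {'conv1': 0.3, 'conv2': 0.5, 'conv3': 0.7}
```

In an actual pruning pipeline these per-layer rates would then drive a filter-selection criterion (e.g. filter norms, or FPGM's geometric-median distance) to decide which filters to remove in each layer.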
Keywords/Search Tags:neural network, model compression, pruning, task deployment, comprehensive compression