
Research On Edge Computing Deep Neural Network Optimization Technology Based On Channel Pruning

Posted on: 2022-10-13  Degree: Master  Type: Thesis
Country: China  Candidate: M R Liu  Full Text: PDF
GTID: 2518306563473624  Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the era of the Internet of Everything (IoE) and the explosive growth in the number of Internet of Things (IoT) devices, massive amounts of real-time data are being generated. Traditional cloud computing can no longer meet these computing demands, so edge computing has been widely adopted as a new computing paradigm. Meanwhile, deep neural network (DNN) technology is developing rapidly, and the demand for combining artificial intelligence (AI) with edge computing is also growing. However, because DNN models require large amounts of storage and power while edge devices have limited resources and battery life, running DNNs at the edge faces great challenges. First, in the scenario where a single resource-constrained edge device computes a DNN independently, compressing the model to a sufficiently small size while keeping the accuracy loss as small as possible is a major challenge. Moreover, since a single edge device offers limited room for improving DNN performance, a cloud server can be enlisted for collaborative computing. In this scenario, the key problems are how to accelerate collaborative inference and how to select the best partition point according to the application's latency or accuracy requirements. In view of these problems and challenges, and building on existing research, this thesis conducts in-depth research on DNN optimization technology for edge computing. The main contents and contributions are as follows.

First, for the scenario of a single resource-constrained edge device computing a DNN independently, this thesis proposes a DNN channel pruning algorithm based on an attention mechanism. A dual attention module, SCA, is designed; it not only improves the performance of the DNN model but also generates statistics that serve as an important criterion for judging whether a channel is redundant. The CPSCA channel pruning algorithm is then proposed under the guidance of the SCA attention mechanism. The experimental results show that the SCA attention module has the best structural design: compared with other attention mechanisms, it is not only lightweight but also brings a more significant accuracy improvement. In addition, compared with other pruning algorithms at the same pruning ratio, the compressed model obtained by the CPSCA pruning algorithm achieves the highest accuracy.

Second, for the scenario in which edge devices enlist a cloud server for collaborative DNN computing, this thesis proposes an efficient edge-cloud collaborative inference approach, ECCI. First, to address the limited resources and long computing delay of edge devices, the CPSCA channel pruning algorithm is used to compress the edge-side model. Second, to reduce the long transmission delay of intermediate data, a "three-step compression" strategy is proposed to shrink the intermediate output data to be transmitted. Finally, to meet an application's specific delay or accuracy requirements, an optimal partition point selection algorithm is proposed. The experimental results show that this approach significantly reduces the computing delay of edge devices and the transmission delay of intermediate data, accelerating the edge-cloud collaborative inference process. Compared with other computing approaches, it shows clear advantages in both delay and accuracy.
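The abstract does not detail how CPSCA consumes the SCA statistics, but the general idea of attention-guided channel pruning can be sketched briefly. Everything below — the function name, the `attention_stats` input, and the rank-and-cut rule — is an illustrative assumption, not the thesis' actual algorithm:

```python
# Minimal sketch of attention-guided channel pruning (pure Python).
# Assumption: an attention module has produced one importance score per
# channel (e.g., attention weights averaged over a calibration set), and
# the lowest-scoring fraction of channels is treated as redundant.

def select_channels_to_prune(attention_stats, prune_ratio):
    """Return the indices of channels to remove.

    attention_stats: per-channel importance scores from the attention module.
    prune_ratio:     fraction of channels to prune, in [0, 1).
    """
    n_prune = int(len(attention_stats) * prune_ratio)
    # Rank channels by ascending score; the smallest are "redundant".
    ranked = sorted(range(len(attention_stats)),
                    key=lambda c: attention_stats[c])
    return sorted(ranked[:n_prune])

# Example: 8 channels, prune the 25% with the lowest attention scores.
scores = [0.91, 0.05, 0.47, 0.88, 0.02, 0.63, 0.30, 0.74]
print(select_channels_to_prune(scores, 0.25))  # -> [1, 4]
```

In a real pipeline the selected channels (and the matching filters in the following layer) would be physically removed and the network fine-tuned to recover accuracy; the sketch only shows the selection criterion.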
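The abstract likewise does not specify ECCI's partition point algorithm; a common simplified latency model makes the idea concrete. The sketch below assumes layers 0..k-1 run on the edge, layer k-1's output is uploaded, and the remaining layers run in the cloud; all names and the example numbers are hypothetical:

```python
# Hedged sketch of optimal partition point selection for edge-cloud
# collaborative inference under a simple additive latency model.

def best_partition(edge_lat, cloud_lat, transfer_bytes, bandwidth):
    """Return (k, total_latency) minimizing end-to-end delay.

    edge_lat[i]:       latency of layer i on the edge device (s)
    cloud_lat[i]:      latency of layer i on the cloud server (s)
    transfer_bytes[k]: bytes uploaded when partitioning at k
                       (transfer_bytes[0] is the raw input size)
    bandwidth:         uplink bandwidth (bytes/s)
    """
    n = len(edge_lat)
    best_k, best_t = 0, float("inf")
    for k in range(n + 1):  # k layers on the edge, the rest in the cloud
        total = (sum(edge_lat[:k])                 # edge compute
                 + transfer_bytes[k] / bandwidth   # upload intermediate data
                 + sum(cloud_lat[k:]))             # cloud compute
        if total < best_t:
            best_k, best_t = k, total
    return best_k, best_t

# Example: 4 layers; intermediate outputs shrink deeper in the network.
edge = [0.05, 0.10, 0.30, 0.60]
cloud = [0.005, 0.004, 0.003, 0.006]
xfer = [4_000_000, 2_000_000, 250_000, 100_000, 4_000]
k, t = best_partition(edge, cloud, xfer, bandwidth=1_000_000)
print(k)  # -> 2: layers 0-1 on the edge, layers 2-3 in the cloud
```

An accuracy-aware variant would restrict the search to partitions whose compressed intermediate data still meets the application's accuracy requirement, which is the trade-off the abstract describes.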
Keywords/Search Tags:Edge Computing, Deep Neural Network, Attention Mechanism, Channel Pruning, Model Compression, Edge Cloud Collaboration