
Research On Neural Network Compression Method For Edge Computing Platform

Posted on: 2021-10-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Zhao
Full Text: PDF
GTID: 2518306494495554
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet and artificial intelligence technology, edge-cloud hybrid intelligent computing has become mainstream, so neural networks must be deployed at both the edge and the cloud. In AI scenarios, cloud computing power can be extended to edge nodes close to terminal devices through cloud-edge collaboration: large volumes of data are used to train an AI model in the cloud, the model is then packaged and deployed to an edge node for inference, and the data collected at the edge node is sent back to the cloud for further training, forming a closed loop. However, limits on computing power, battery life, and bandwidth cost make it difficult to deploy complex neural networks directly at the edge. At the algorithm level, model compression, lightweight network design, and related techniques are therefore needed to reduce computation and parameter counts so that AI models can run on edge computing platforms. An edge computing platform is an open platform integrating networking, computing, storage, and applications; by processing and analyzing data close to its source, it avoids frequent data transfers and reduces latency.

This thesis studies model compression techniques for edge computing platforms. The main research contents are as follows:

For the image classification task, we first select a lightweight classification network and optimize the network model. We then apply a series of model compression operations, such as model pruning, knowledge distillation, and quantization, and compare the impact of each method on the model. Finally, the individual compression methods are combined to reduce the complexity of the model through multiple rounds of compression.

For the image segmentation task, we first use a Feature Pyramid Network (FPN) to improve segmentation accuracy, and then apply hybrid compression to the optimized model. The hybrid compression mainly follows the once-for-all (OFA) neural architecture search method, which combines structured search with knowledge distillation and trains a large number of sub-networks simultaneously. Inference uses only a part of the whole network, flexibly supporting different depths and widths without retraining.

To verify the effect of the hybrid model compression algorithm, the compressed neural network models are converted to the ONNX format, accelerated with ONNX Runtime, and deployed on Jetson Nano, Jetson TX2, and other edge computing platforms. The experimental results show that the proposed hybrid compression method achieves a better compression effect, and that hardware-accelerated inference further improves inference speed, giving high real-time performance. This indicates that our method generalizes well across tasks and has a wide range of applications.
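To give a concrete sense of two of the compression operations mentioned above, the sketch below shows magnitude-based weight pruning and symmetric per-tensor int8 quantization on a plain NumPy weight matrix. This is a minimal illustration of the general techniques, not the thesis's actual implementation; the sparsity level, the 127-level symmetric scheme, and all function names are illustrative assumptions.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w).ravel())[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: returns integer codes and a scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to float32 for accuracy checking."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.5)     # at least half the weights become zero
q, scale = quantize_int8(w_pruned)              # 8-bit storage instead of 32-bit
w_restored = dequantize(q, scale)               # round-off error is at most scale / 2
```

In a real pipeline the pruned/quantized model would then be fine-tuned to recover accuracy, which is what makes comparing the methods (as the thesis does) meaningful.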
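Knowledge distillation, used both as a standalone compression step and inside the hybrid OFA-style training described above, trains a small student against the softened outputs of a large teacher. Below is a NumPy sketch of the standard temperature-scaled distillation loss (soft KL term plus hard-label cross-entropy); the temperature and weighting values are illustrative assumptions, not the thesis's settings.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax with a max-shift for numerical stability."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Weighted sum of a soft-target KL term and hard-label cross-entropy."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in the standard distillation formulation
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12)
    return float(np.mean(alpha * (T ** 2) * kl + (1 - alpha) * hard))

teacher = np.array([[5.0, 1.0, -2.0]])
good_student = np.array([[4.5, 1.2, -1.8]])   # mimics the teacher -> low loss
bad_student = np.array([[-2.0, 0.0, 3.0]])    # contradicts the teacher -> high loss
labels = np.array([0])
```

A student whose logits track the teacher's incurs a much smaller loss than one that contradicts it, which is the signal that lets a compressed network inherit the large model's behavior.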
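The once-for-all idea of serving many depths and widths from a single trained supernet can be illustrated with a toy model: store full-size layer weights once, and obtain a sub-network by slicing depth and width at inference time. The class and dimensions below are invented for illustration; the real OFA method additionally uses progressive-shrinking training and elastic kernel sizes.

```python
import numpy as np

class ElasticSupernet:
    """Toy once-for-all style supernet: full-size layer weights are stored once,
    and a sub-network is selected by slicing depth and width, with no retraining."""

    def __init__(self, max_depth=4, max_width=8, seed=0):
        rng = np.random.default_rng(seed)
        self.max_depth = max_depth
        self.max_width = max_width
        # Square layers so any width slice keeps consecutive shapes compatible.
        self.weights = [rng.normal(size=(max_width, max_width))
                        for _ in range(max_depth)]

    def forward(self, x, depth, width):
        """Run only the first `depth` layers, each cropped to `width` channels."""
        h = x[:width]
        for w_full in self.weights[:depth]:
            h = np.maximum(w_full[:width, :width] @ h, 0.0)  # ReLU
        return h

net = ElasticSupernet()
x = np.ones(8)
small = net.forward(x, depth=2, width=4)   # shallow, narrow sub-network for the edge
large = net.forward(x, depth=4, width=8)   # the full network
```

Because every sub-network shares the same stored weights, an edge device can pick the depth/width that fits its compute budget without loading or retraining a separate model.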
Keywords/Search Tags:model compression, edge computing, deep learning, hardware deployment, image classification, image segmentation