
Research On Edge-device Cooperative Inference Mechanism Based On Neural Network Model Simplification And Partition

Posted on: 2022-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Zhou
Full Text: PDF
GTID: 2518306605969639
Subject: Master of Engineering
Abstract/Summary:
In recent years, thanks to breakthroughs in deep learning technology, artificial intelligence applications and services have flourished. At the same time, the number of mobile phones and edge devices grows every year: hundreds of millions of devices are connected to the Internet, generating large amounts of data at the edge of the network. With the explosive growth of mobile data and the improvement of edge computing capabilities, the concept of edge intelligence is receiving great attention. It emphasizes the use of computing and storage resources at the edge of the network to provide mobile devices with low-latency, privacy-preserving artificial intelligence services. However, for complex tasks, well-trained deep learning (DL) models are very large, while edge devices usually have weak computing power and limited storage and energy, so it is usually not feasible to deploy DL models on resource-constrained mobile devices and perform timely, reliable inference locally. Therefore, how to streamline DL models and achieve fast inference is the key challenge in pushing artificial intelligence to the edge.

Since model compression and model partitioning techniques enable fast inference for edge intelligent applications, this thesis proposes an edge-device collaborative inference framework that combines neural network model pruning with model partitioning. It aims to use collaborative computing between edge devices and the edge cloud to enable efficient inference of deep learning models. Under this framework, the work of this thesis is carried out along the following two lines.

To address the problem that resource- and energy-constrained IoT devices cannot directly handle computationally intensive intelligent tasks, this thesis proposes a two-step model simplification algorithm based on convolution kernel pruning, which reduces the model size by removing redundant parameters from the neural network. The pruning criterion uses a Taylor expansion to approximate the change in the loss function before and after a parameter is pruned, so that "unimportant" convolution kernels are pruned according to their estimated importance. The algorithm first prunes the entire neural network and then prunes specific layers separately, achieving a two-step simplification of the model. Simulation experiments show that the de-redundant deep neural network achieves good results in maintaining accuracy, reducing execution latency, and reducing the volume of per-layer output data.
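As a rough illustration of this pruning criterion (a minimal sketch, not the thesis's implementation), the first-order Taylor approximation scores a convolution kernel by |activation × gradient| summed over its output: an estimate of the change in the loss if that kernel's output were removed. The PyTorch snippet below computes such importance scores for one convolutional layer; the model, data, and all names are hypothetical.

```python
import torch
import torch.nn as nn

def taylor_filter_importance(model, conv_layer, inputs, targets, criterion):
    """Score each filter of `conv_layer` with the first-order Taylor
    criterion |sum(activation * gradient)|: an estimate of the change
    in the loss if that filter's output were removed."""
    activations = {}

    def hook(module, inp, out):
        out.retain_grad()            # keep the gradient of this activation
        activations["out"] = out

    handle = conv_layer.register_forward_hook(hook)
    loss = criterion(model(inputs), targets)
    loss.backward()
    handle.remove()

    act = activations["out"]                       # shape (N, C, H, W)
    # First-order Taylor term, summed over batch and spatial dimensions,
    # giving one importance score per output channel (filter).
    scores = (act * act.grad).sum(dim=(0, 2, 3)).abs()
    return scores / (scores.norm() + 1e-8)         # layer-wise normalization

# Usage sketch: rank the filters of the first conv layer of a toy CNN
# on one mini-batch (architecture and shapes are illustrative).
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
scores = taylor_filter_importance(model, model[0], x, y, nn.CrossEntropyLoss())
print("least important filters:", scores.argsort()[:4].tolist())
```

Kernels with the lowest scores are the pruning candidates; in a two-step scheme such as the one described above, a whole-network pass of this kind would be followed by separate per-layer passes.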
To address the problem that neural network models deployed on resource-limited edge terminals cannot achieve fast inference, this thesis first compresses the model with the two-step simplification algorithm and then partitions it appropriately, offloading part of the computation to the edge cloud to realize edge-device collaborative inference. With the goals of reducing end-to-end latency, guaranteeing inference accuracy, and improving the quality of the model partition, this thesis proposes an online, on-demand algorithm for selecting the optimal model split point. Simulation experiments show that the choice of split point under different performance requirements is affected by factors such as wireless bandwidth and differences in edge computing resources; the algorithm reasonably distributes the deep neural network model between the edge cloud and edge terminals for execution, balancing the transmission and computation workload between the two sides and further demonstrating the effectiveness of the collaborative inference framework.
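To make the split-point selection concrete, the sketch below captures only its latency term (the thesis's online, on-demand algorithm also accounts for inference accuracy and edge resource constraints): given per-layer execution times measured on the device and on the edge cloud, plus each layer's output size, it enumerates the candidate split points of a layer chain and picks the one minimizing device computation + wireless transmission + edge computation. All profiles and numbers are illustrative.

```python
from dataclasses import dataclass

@dataclass
class LayerProfile:
    name: str
    device_ms: float    # measured execution time on the edge device
    edge_ms: float      # measured execution time on the edge cloud
    out_bytes: int      # size of the layer's output feature map

def best_split(layers, input_bytes, bandwidth_bps):
    """Return the split index k minimizing estimated end-to-end latency:
    layers[:k] run on the device, layers[k:] on the edge cloud, and the
    output of layer k-1 (or the raw input, if k == 0) is transmitted.
    k == len(layers) means fully on-device, with nothing transmitted."""
    n = len(layers)
    best_k, best_ms = 0, float("inf")
    for k in range(n + 1):
        device_ms = sum(l.device_ms for l in layers[:k])
        edge_ms = sum(l.edge_ms for l in layers[k:])
        sent = layers[k - 1].out_bytes if k > 0 else input_bytes
        tx_ms = 0.0 if k == n else sent * 8 / bandwidth_bps * 1000
        total = device_ms + tx_ms + edge_ms
        if total < best_ms:
            best_k, best_ms = k, total
    return best_k, best_ms

# Usage sketch with made-up profiles and a 20 Mbps wireless link.
profiles = [
    LayerProfile("conv1", device_ms=5.0,  edge_ms=0.5, out_bytes=602_112),
    LayerProfile("conv2", device_ms=20.0, edge_ms=1.0, out_bytes=50_176),
    LayerProfile("fc",    device_ms=30.0, edge_ms=0.5, out_bytes=4_096),
]
k, latency = best_split(profiles, input_bytes=150_528, bandwidth_bps=20e6)
print(f"run {k} layer(s) on-device, estimated latency {latency:.1f} ms")
```

Because the transmission term scales with the chosen layer's output size and the available bandwidth, the optimal split shifts toward the device when bandwidth is low and toward the edge cloud when bandwidth is high, consistent with the bandwidth sensitivity reported in the simulation experiments above.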
Finally, based on an image classification application, a simple system prototype of the collaborative inference framework proposed in this thesis is implemented.

Keywords/Search Tags: Edge Intelligence, Model Pruning, Model Partition, Edge-device Cooperative Inference