
Research And Implementation Of Deep Neural Network Inference System Based On Edge-Cloud Collaboration

Posted on: 2023-09-15  Degree: Master  Type: Thesis
Country: China  Candidate: J N Chen  Full Text: PDF
GTID: 2568306914474194  Subject: Information and Communication Engineering
Abstract/Summary:
The development of deep learning technology has brought new changes to many fields, and deep neural networks (DNNs) have become an indispensable tool for artificial intelligence. In many business scenarios there is also a growing need to deploy models at the edge to achieve low latency and protect privacy. However, limited by the computing power of edge devices, DNN inference at the edge is much slower than in the cloud, and current edge-cloud architectures cannot adequately solve the low inference efficiency caused by sinking models to the edge. This thesis proposes an edge-cloud collaborative DNN inference system that extends model deployment to edge nodes: by partitioning and deploying the model, the edge and the cloud cooperate to complete the inference computation and reduce inference latency. The main contents of this thesis are as follows:

(1) A DNN model partition and deployment algorithm for edge-cloud collaborative inference. In the traditional edge-cloud collaborative inference system, the three stages of inference — edge inference, data transmission, and cloud inference — are completely serialized. This thesis proposes a model partition and deployment strategy that parallelizes these three stages. First, LEM, a linear model relating FLOPs to inference time, is proposed to evaluate the computing power of a device. On this basis, partition algorithms for chain-structured and DAG-structured models are proposed: the DNN model is divided into multiple branches, which are then partitioned and distributed to achieve parallel deployment and inference, improving inference efficiency.

(2) Optimization of the data transmission mechanism in edge-cloud collaborative inference. The QUIC protocol is deeply integrated into the edge-cloud collaborative inference system to improve data transmission efficiency between the edge and the cloud. Exploiting the features of QUIC, a multi-stream transmission mechanism and a data pre-transmission mechanism are proposed for the intermediate-layer data produced during DNN inference, which effectively reduce the cost of transmitting intermediate-layer inference data; an adaptive congestion control strategy is also proposed to adapt to changes in the network. Through these strategies, the latency caused by data transmission in collaborative inference is significantly reduced.

(3) Implementation of the edge-cloud collaborative inference system. Based on the above algorithms and strategies, an edge-cloud collaborative DNN inference system is implemented, providing functions such as automatic evaluation of device computing power, automatic model analysis, and adaptive model partitioning.
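The partition idea described in (1) can be illustrated with a minimal sketch: model each device's per-layer latency with a linear FLOPs-to-time function, then search a chain-structured model for the cut point that minimizes edge compute plus transmission plus cloud compute. This is only an illustration of the general technique, not the thesis's actual algorithm; the LEM coefficients, layer FLOPs, activation sizes, and bandwidth below are all hypothetical placeholders.

```python
def lem_latency(flops, a, b):
    """LEM-style linear evaluation model: inference time ~= a * FLOPs + b."""
    return a * flops + b

def best_split(layer_flops, cut_bytes, edge_coef, cloud_coef, bandwidth):
    """Choose a cut point k for a chain model: layers [0, k) run on the
    edge, layers [k, n) run in the cloud, and cut_bytes[k] bytes of
    activation data cross the network at cut k (cut_bytes[0] is the raw
    model input). Returns (k, end-to-end latency in seconds)."""
    n = len(layer_flops)
    best_k, best_t = 0, float("inf")
    for k in range(n + 1):  # k = 0: all cloud; k = n: all edge
        edge_t = sum(lem_latency(f, *edge_coef) for f in layer_flops[:k])
        cloud_t = sum(lem_latency(f, *cloud_coef) for f in layer_flops[k:])
        tx_t = cut_bytes[k] / bandwidth
        total = edge_t + tx_t + cloud_t
        if total < best_t:
            best_k, best_t = k, total
    return best_k, best_t

# Hypothetical 3-layer chain: activations shrink deeper into the network,
# the edge device is ~100x slower per FLOP than the cloud.
flops = [2e8, 5e8, 1e8]                  # FLOPs per layer
cuts = [6e5, 4e5, 5e4, 1e3]              # bytes crossing each possible cut
k, t = best_split(flops, cuts,
                  edge_coef=(1e-9, 0.002),   # slow edge: ~1 GFLOPS
                  cloud_coef=(1e-11, 0.005), # fast cloud, higher base cost
                  bandwidth=1e6)             # 1 MB/s uplink
print(k, round(t, 3))
```

With these placeholder numbers the search cuts after the first layer: running one cheap layer on the edge shrinks the activation enough that transmission, not computation, stops dominating. The same exhaustive scan generalizes to per-branch cuts in a DAG-structured model, though the thesis's actual DAG algorithm is not reproduced here.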
Keywords/Search Tags:DNN Inference, DNN Partition, Edge-Cloud Collaboration, QUIC