
Research On Adaptive Technology For Deep Learning Model Inference In Dynamic Edge Environment

Posted on: 2024-08-17  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang  Full Text: PDF
GTID: 2568306944463594  Subject: Computer technology
Abstract/Summary:
Deep learning models have been widely applied in many scenarios, such as providing perception information for commanders in emergency rescue. However, running deep learning model inference efficiently on edge devices remains a great challenge. Because of the limits on computation capability, storage space, network bandwidth, and energy consumption of such devices, traditional cloud-based inference cannot meet the real-time, privacy, and reliability requirements of edge scenarios. To address these issues, this thesis studies adaptive inference techniques for deep learning models in dynamic edge environments from two aspects: single-device pruning inference and device-edge collaborative inference.

First, for inference on a single device, this thesis proposes a category-adaptive pruning inference method under dynamic resources. The method adapts the model pruning rate to the device's resources and the task's demands. Specifically, it predicts the inference energy consumption at different pruning rates through data-driven energy consumption modeling, designs a category attention mechanism to improve the accuracy of specific categories, and solves a resource-demand optimization model to obtain the optimal pruning rate. Experimental results show that the method achieves efficient and accurate inference under device-resource and task-demand constraints.

Second, for the collaborative inference mode between device and edge, this thesis proposes an inference method based on high-compression-ratio features and feature compensation under dynamic communication conditions. The method addresses the problems of feature compression and compensation in device-edge collaborative inference. Specifically, the framework uses a heavily parameterized encoder with a high compression ratio to transmit very-low-volume features, a feature-consistency decoder that exploits latent information to compensate for the compression loss and preserve model accuracy, and a bandwidth-aware adaptive feature configuration module that dynamically selects appropriate compression parameters. Experimental results show that the method has clear advantages in perceiving targets with low transmission latency and high inference accuracy.

Finally, for emergency scenarios in which commanders must quickly and effectively use edge computing resources for inference, this thesis implements a dynamic-edge deep-learning adaptive inference system. According to task requirements, the system selects nodes for inference in either single-device or device-edge collaborative mode and manages the whole inference process. The proposed single-device pruning inference algorithm and device-edge collaborative inference algorithm are deployed in the system, and functional and non-functional tests are conducted. Test results show that the proposed algorithms and system achieve high efficiency, reliability, and adaptability.
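The pruning-rate selection described above can be sketched as a small constrained optimization: given data-driven models of energy, latency, and accuracy as functions of the pruning rate, pick the feasible rate with the best predicted accuracy. This is a minimal illustration only; the function names, closed-form models, and all numbers are hypothetical and not taken from the thesis.

```python
# Hypothetical sketch of resource-demand-constrained pruning-rate selection.
# All models and numbers below are illustrative, not from the thesis.

def select_pruning_rate(candidates, energy_model, accuracy_model,
                        latency_model, energy_budget, latency_budget):
    """Return the pruning rate that maximizes predicted accuracy
    while satisfying the device's energy and latency budgets."""
    feasible = [
        r for r in candidates
        if energy_model(r) <= energy_budget and latency_model(r) <= latency_budget
    ]
    if not feasible:
        # Degrade gracefully: prune most aggressively when nothing fits the budget.
        return max(candidates)
    return max(feasible, key=accuracy_model)

# Illustrative "data-driven" models fitted offline (here: simple closed forms).
energy = lambda r: 5.0 * (1.0 - r)       # joules per inference, drops with pruning
latency = lambda r: 80.0 * (1.0 - r)     # milliseconds, drops with pruning
accuracy = lambda r: 0.95 - 0.4 * r**2   # accuracy degrades as pruning grows

rates = [0.0, 0.1, 0.2, 0.3, 0.5, 0.7]
best = select_pruning_rate(rates, energy, accuracy, latency,
                           energy_budget=4.0, latency_budget=60.0)
# With these illustrative models, rates below 0.25 violate a budget,
# so the least-pruned feasible rate (0.3) wins on accuracy.
```

In a real deployment the three models would be regressions fitted to on-device measurements rather than closed forms, and the search could run periodically as resource availability changes.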
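The bandwidth-aware feature configuration module can likewise be sketched: measure the current link bandwidth, then choose the smallest compression ratio whose transmission time still meets the latency deadline, falling back to maximum compression on poor links. All ratios, sizes, and the deadline below are assumptions for illustration, not values from the thesis.

```python
# Hypothetical sketch of bandwidth-aware compression-parameter selection.
# Ratios, feature size, and deadline are illustrative, not from the thesis.

def pick_compression(bandwidth_mbps, feature_bits, ratios, deadline_ms):
    """Pick the smallest compression ratio whose transmission time fits
    the latency deadline; a higher ratio means a smaller payload."""
    for ratio in sorted(ratios):                    # prefer least compression (best fidelity)
        tx_ms = feature_bits / ratio / (bandwidth_mbps * 1e6) * 1e3
        if tx_ms <= deadline_ms:
            return ratio
    return max(ratios)                              # degrade gracefully under poor links

ratios = [8, 16, 32, 64]
# Intermediate feature map of ~2 Mbit, 10 ms transmission budget (assumed).
cfg_good = pick_compression(bandwidth_mbps=50.0, feature_bits=2e6,
                            ratios=ratios, deadline_ms=10.0)  # fast link -> ratio 8
cfg_poor = pick_compression(bandwidth_mbps=2.0, feature_bits=2e6,
                            ratios=ratios, deadline_ms=10.0)  # slow link -> ratio 64
```

The design choice here mirrors the abstract's trade-off: lower compression preserves more feature information for the decoder, so compression is increased only as far as the measured bandwidth requires.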
Keywords/Search Tags:emergency scene, dynamic edge environment, model pruning, device-edge collaborative, feature compression