In recent years, deep neural networks have achieved continuous breakthroughs in computer vision and natural language processing. As algorithms have advanced, the demand for deploying neural network applications in cloud, terminal, edge, and other scenarios has gradually increased. However, existing deep learning models have high computational complexity and large numbers of parameters, which poses great challenges for deployment in edge environments such as mobile devices, where hardware resources and power consumption are tightly constrained. Deep model compression and acceleration techniques can greatly reduce the number of parameters and computations with little loss of accuracy, lowering the difficulty of deploying deep models. Targeting the edge environment, this paper conducts research from the perspectives of deep model pruning compression and collaborative inference acceleration. The specific research contents are as follows:

(1) An automatic structured pruning algorithm for deep models based on reinforcement learning. To address the selection of the pruning criterion and pruning rate for each layer during model pruning, a filter pruning scheme that jointly optimizes pruning criteria and pruning rates is proposed. This paper fully considers pruning sensitivity and the internal relationships between layers, re-formulates filter pruning as an optimization problem that minimizes the accuracy loss after pruning subject to a target sparsity, and solves this mixed-variable nonlinear optimization problem with the parameterized deep Q-network (PDQN) algorithm. Experimental results show that, under a given target sparsity, the proposed scheme selects an appropriate pruning criterion and pruning rate for each layer and reduces the accuracy loss after model pruning.

(2) A pruning algorithm based on a spatial and channel attention mechanism. To address the problem of measuring channel importance during pruning, this paper proposes an attention-based channel importance measure. Inspired by the observation that attention mechanisms help a model focus on important features, a Spatial Channel Attention (SCA) module is introduced on each convolutional layer to obtain an attention score for every output channel, and redundant channels are removed according to these scores. The algorithm combines the pruning process with network training, using the attention module to evaluate channel importance at low overhead. Experimental results demonstrate that the scheme selects redundant channels according to their attention scores and reduces the impact of pruning operations on model accuracy.

(3) Research on complexity-aware collaborative inference acceleration. To address the high latency and unstable communication bandwidth faced by deep model inference in edge environments, this paper proposes a complexity-aware collaborative inference scheme. By adjusting the exit threshold of each early-exit branch during progressive inference and the model split point during collaborative inference, the scheme copes with dynamic changes in the edge environment, and a reinforcement learning method is used to optimize the adjustment strategy for the exit thresholds and the split point. Experimental results show that the scheme adapts well to changes in communication bandwidth and input data complexity, and meets the needs of different types of edge intelligence applications.
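To make contribution (1) concrete, the following is a minimal sketch of how a mixed PDQN-style action, a discrete pruning criterion plus a continuous pruning rate, could be applied to a single convolutional layer. The criterion names (`l1`, `gm`) and the use of the mean filter as a cheap stand-in for the geometric median are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def apply_pruning_action(filter_weights, criterion, rate):
    """Score each filter under the chosen criterion and return the indices
    of the lowest-scoring fraction `rate`, i.e. the filters to prune.

    filter_weights: array of shape (n_filters, ...) for one conv layer.
    criterion: 'l1' (magnitude) or 'gm' (distance-to-center), the discrete
               half of the mixed action chosen by the agent.
    rate: pruning rate in [0, 1), the continuous half of the action.
    """
    flat = filter_weights.reshape(len(filter_weights), -1)
    if criterion == 'l1':
        scores = np.abs(flat).sum(axis=1)        # L1-norm importance
    elif criterion == 'gm':
        # distance to the mean filter, a simplified proxy for the
        # geometric-median style "most replaceable filter" criterion
        center = flat.mean(axis=0)
        scores = np.linalg.norm(flat - center, axis=1)
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    n_prune = int(rate * len(flat))
    return np.argsort(scores)[:n_prune]          # prune the lowest scores
```

In a full pipeline, the RL agent would emit one such (criterion, rate) pair per layer and receive the post-pruning accuracy (penalized by sparsity violation) as the reward.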
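For contribution (2), a toy version of attention-based channel scoring can be sketched as follows. The bottleneck structure (global average pool, small two-layer MLP, sigmoid) is a common channel-attention pattern and is assumed here for illustration; the actual SCA module in the thesis also incorporates spatial attention and is trained jointly with the network.

```python
import numpy as np

def channel_attention_scores(feature_map, w1, w2):
    """Toy channel-attention scoring: globally pool each channel, pass the
    pooled vector through a two-layer bottleneck, squash with a sigmoid.

    feature_map: (C, H, W) activations of one conv layer.
    w1, w2: bottleneck weights with shapes (C, r) and (r, C), r << C.
    """
    pooled = feature_map.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(pooled @ w1, 0.0)         # ReLU bottleneck
    return 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid scores in (0, 1)

def redundant_channels(scores, keep_ratio):
    """Indices of channels NOT among the top `keep_ratio` fraction by score;
    these are the candidates for removal."""
    k = int(keep_ratio * len(scores))
    keep = set(np.argsort(scores)[-k:].tolist())
    return sorted(set(range(len(scores))) - keep)
```

During pruning, channels flagged as redundant would have their filters (and the matching input channels of the next layer) deleted, after which the attention module itself can be discarded.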
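Finally, the latency trade-off in contribution (3) can be illustrated with a small simulation of progressive inference over early-exit branches split between a device and a server. All costs, the threshold vector, and the single split point are hypothetical parameters, chosen only to show the knobs the reinforcement learning agent would tune.

```python
def run_collaborative_inference(branch_confidences, thresholds, split_point,
                                device_cost, server_cost, bandwidth, feat_size):
    """Simulate early-exit inference with one device/server split point.

    branch_confidences: confidence the model would report at each early-exit
        branch for this input (a proxy for input complexity).
    thresholds: per-branch exit threshold (a knob tuned by the RL agent).
    split_point: index of the first branch executed on the server
        (the other knob tuned by the RL agent).
    device_cost / server_cost: per-branch compute latency on each side.
    bandwidth: link rate; feat_size: size of the transferred feature map.
    Returns (exit_branch, total_latency).
    """
    latency = 0.0
    for i, (conf, thr) in enumerate(zip(branch_confidences, thresholds)):
        if i == split_point:
            latency += feat_size / bandwidth     # upload intermediate features
        latency += device_cost if i < split_point else server_cost
        if conf >= thr:                          # confident enough: exit early
            return i, latency
    return len(branch_confidences) - 1, latency
```

The simulation makes the adaptation intuitive: easy inputs (high early confidence) exit on-device at low cost, while low bandwidth pushes the optimal split point later so that fewer or smaller features are transmitted.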