In recent years, deep learning has achieved breakthroughs in areas such as autonomous driving, speech recognition, natural language processing, and medical image processing, opening a new era of intelligent Internet of Things (IoT) applications built on deep neural networks. As the accuracy of neural network models improves, their parameter sizes also grow, imposing a heavy burden on resource-constrained IoT devices and introducing excessive latency in intelligent applications. To reduce this cost, researchers have devoted considerable effort to designing lightweight deep neural network models, but existing work faces two bottlenecks: compressing model parameters causes significant accuracy degradation, and it is difficult to balance resource consumption against model performance on heterogeneous devices.

This paper systematically investigates how to design and deploy lightweight deep neural network models. Specifically, we first design a channel-changeable dynamic neural network architecture that contains multiple independent sub-networks with different parameter sizes and inference costs, enabling a runtime trade-off between resource consumption and accuracy. We then design a training method based on in-place distillation and frozen updating to improve the training quality of the dynamic model. Finally, we construct an inference strategy based on sliding updates and feature caching to stabilize the inference response time and eliminate redundant computation when switching between sub-networks.

The main contributions of this paper are as follows:

1. We design a channel-changeable dynamic neural network architecture that contains multiple sub-networks with different performance levels and can be applied to mainstream deep neural network models. The model consists of a feature extraction part and a classification part: the feature extraction part is built with a strategy of head-convolution sharing and incremental branch concatenation, and the classification part is built on scalable shared neural layers (sketched below).

2. We propose a training method based on in-place distillation and frozen updating. In-place distillation uses the largest sub-network as the teacher model to guide the remaining sub-networks, as student models, in learning its features, yielding better-performing sub-networks. The frozen updating mechanism suppresses mutual perturbation between sub-networks during back-propagation through the shallow layers of the network and improves the overall performance of the model (sketched below).

3. We construct a sliding update-based adaptive routing decision maker. In the inference preparation stage, the decision maker initializes the thresholds of all sub-networks in the dynamic model with a single round of information-entropy collection on the training set, achieving a trade-off between accuracy and inference computation. In the inference stage, it adjusts the threshold of each sub-network on the fly, based on latency feedback, through a sliding update mechanism, which stabilizes the inference time of the model under different workloads and preset latency requirements while maximizing accuracy (sketched below).

4. We adopt a feature caching-based inference mechanism. This mechanism decouples the forward propagation of the sub-networks and, by caching intermediate feature tensors, eliminates the redundant computation in the feature extraction stage of the later sub-networks, thereby speeding up inference (sketched below).

Experimental results on two public datasets show that, compared with mainstream methods, the proposed dynamic neural network architecture reduces computation by 22.4% to 25.0% and accelerates inference responses by 10.5% to 34.7% while maintaining similar accuracy.
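The following is a minimal sketch of the channel-changeable idea in contribution 1: a head convolution shared by all sub-networks, branch convolutions whose outputs are concatenated incrementally as larger sub-networks are selected, and a shared classifier whose weight matrix is sliced to match the active channel width. The class name, layer sizes, and the `level` argument are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

class ChannelChangeableBlock(nn.Module):
    """Shared head convolution + incrementally concatenated branches,
    followed by a scalable shared classifier (illustrative sketch)."""

    def __init__(self, in_ch=3, head_ch=32, branch_ch=16, num_branches=3,
                 num_classes=10):
        super().__init__()
        # Head convolution shared by every sub-network.
        self.head = nn.Conv2d(in_ch, head_ch, 3, padding=1)
        # Each larger sub-network adds one more branch convolution.
        self.branches = nn.ModuleList(
            nn.Conv2d(head_ch, branch_ch, 3, padding=1)
            for _ in range(num_branches)
        )
        # Scalable shared classifier: the full weight covers the widest
        # sub-network; narrower ones use a leading slice of its columns.
        self.classifier = nn.Linear(head_ch + num_branches * branch_ch,
                                    num_classes)

    def forward(self, x, level):
        """`level` selects the sub-network: 0 uses only the shared head,
        k concatenates the outputs of the first k branches."""
        feats = [torch.relu(self.head(x))]
        for branch in self.branches[:level]:
            feats.append(torch.relu(branch(feats[0])))
        feats = torch.cat(feats, dim=1)          # incremental concatenation
        pooled = feats.mean(dim=(2, 3))          # global average pooling
        weight = self.classifier.weight[:, : pooled.shape[1]]
        return pooled @ weight.t() + self.classifier.bias

net = ChannelChangeableBlock()
x = torch.randn(2, 3, 32, 32)
small = net(x, level=0)   # cheapest sub-network
large = net(x, level=3)   # full model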
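Contribution 2 combines in-place distillation with frozen updating. The sketch below shows one training step under the illustrative interface of the previous block: the largest sub-network learns from hard labels and serves as the teacher, the smaller sub-networks match its soft predictions, and the shallow shared layers are frozen while the students back-propagate. The arguments `model(x, level)` and `shallow_layers` are assumptions for illustration.

import torch
import torch.nn.functional as F

def train_step(model, images, labels, levels, shallow_layers, optimizer):
    """One combined update: in-place distillation plus frozen updating."""
    optimizer.zero_grad()

    # 1. The largest sub-network is the teacher and learns from hard labels.
    teacher_logits = model(images, level=max(levels))
    F.cross_entropy(teacher_logits, labels).backward()
    soft_targets = teacher_logits.detach().softmax(dim=1)

    # 2. Frozen updating: the shallow shared layers stop receiving gradients,
    #    so the smaller sub-networks cannot perturb them.
    for p in shallow_layers.parameters():
        p.requires_grad_(False)

    # 3. In-place distillation: each smaller sub-network is a student that
    #    matches the teacher's soft predictions.
    for level in sorted(levels)[:-1]:
        student_logits = model(images, level=level)
        loss = F.kl_div(student_logits.log_softmax(dim=1), soft_targets,
                        reduction="batchmean")
        loss.backward()

    for p in shallow_layers.parameters():
        p.requires_grad_(True)
    optimizer.step()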
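Contribution 3 routes each input through progressively larger sub-networks and stops once the prediction is confident enough. The sketch below assumes per-sub-network entropy thresholds have already been initialized from a pass over the training set (not shown) and illustrates the sliding update that nudges all thresholds according to latency feedback; the step size, budget, and `model(x, level)` interface are illustrative assumptions.

import time
import torch

class SlidingRouter:
    """Illustrative sliding update-based adaptive routing decision maker."""

    def __init__(self, levels, thresholds, target_ms, step=0.05):
        self.levels = sorted(levels)                 # e.g. [0, 1, 2, 3]
        self.thresholds = dict(zip(self.levels, thresholds))
        self.target_ms = target_ms                   # preset latency budget
        self.step = step                             # sliding-update step size

    @staticmethod
    def entropy(logits):
        # Information entropy of the predicted class distribution.
        p = logits.softmax(dim=1)
        return float(-(p * p.clamp_min(1e-12).log()).sum(dim=1).mean())

    @torch.no_grad()
    def infer(self, model, x):
        start = time.perf_counter()
        logits = None
        for level in self.levels:
            logits = model(x, level=level)
            # Confident enough (low entropy): stop at this sub-network.
            if self.entropy(logits) <= self.thresholds[level]:
                break
        latency_ms = (time.perf_counter() - start) * 1000

        # Sliding update: raise the thresholds (exit earlier next time) after
        # overshooting the latency budget, lower them after undershooting.
        delta = self.step if latency_ms > self.target_ms else -self.step
        for level in self.levels:
            self.thresholds[level] = max(0.0, self.thresholds[level] + delta)
        return logits, latency_ms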
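Contribution 4 avoids recomputing shared features when the router escalates from one sub-network to the next on the same input. The sketch below caches the head output and the branch outputs already computed, then evaluates only the missing branches; the attribute names mirror the illustrative block above and are assumptions rather than the authors' implementation.

import torch

class CachedInference:
    """Illustrative feature cache that decouples the forward passes of the
    sub-networks by reusing already-computed intermediate tensors."""

    def __init__(self, model):
        self.model = model
        self._key = None        # identity of the cached input batch
        self._feats = []        # cached head + branch feature tensors

    @torch.no_grad()
    def __call__(self, x, level):
        if self._key is not x:                   # new input: reset the cache
            self._key = x
            self._feats = [torch.relu(self.model.head(x))]
        while len(self._feats) <= level:         # compute only missing branches
            branch = self.model.branches[len(self._feats) - 1]
            self._feats.append(torch.relu(branch(self._feats[0])))
        feats = torch.cat(self._feats[: level + 1], dim=1)
        pooled = feats.mean(dim=(2, 3))
        weight = self.model.classifier.weight[:, : pooled.shape[1]]
        return pooled @ weight.t() + self.model.classifier.bias

# When the router first tries level 0 and then escalates to level 2 on the
# same batch, only the two missing branch convolutions are evaluated.
cached = CachedInference(ChannelChangeableBlock())
batch = torch.randn(2, 3, 32, 32)
_ = cached(batch, level=0)
_ = cached(batch, level=2)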