In recent years, artificial intelligence technology based on deep learning has been widely applied in intelligent manufacturing, medical image analysis, smart grids, and smart cities, bringing profound changes to traditional industries. However, when deep learning is applied to practical scenarios, especially in complex open environments with multiple scenes, difficult tasks, and high demands on cross-scenario adaptability, model deployment still faces many problems and challenges: (1) multiple models cannot be flexibly combined and reused across task scenarios, leading to poor system maintainability and scalability; (2) efficient and convenient model quantization schemes are lacking: most hardware platforms provide only the simplest low-bit quantization methods, leaving little room for optimization and causing serious accuracy loss after quantization; (3) existing AI service platforms mainly support the training stage of neural network models, with weak support for the model lightweighting and serving stages. To address these issues and challenges, this paper starts from the application scenarios and builds a lightweight model service platform for open environments, supporting the lightweighting, serving, and platformization of deep learning technology in practical applications. The main research contents are as follows:

1. Starting from model service deployment, this paper investigates more efficient model service generation and proposes a model service workflow orchestration and deployment method based on a Directed Acyclic Graph (DAG) structure. By modularizing components such as model training, preprocessing, and postprocessing, model service workflows can be composed serially or in parallel, enabling rapid orchestration and recombination of models in complex task scenarios and improving the scalability and maintainability of multi-model, multi-component combinations in open environments.

2. Starting from the critical nodes and optimizable space of quantization, this paper investigates and implements a lightweight end-to-end neural network model quantization technique. Before quantization, the parameter distribution is adjusted across layers via inverse-ratio decomposition, making the model more amenable to quantization; during quantization, weights and quantization parameters are optimized jointly, layer by layer, improving post-quantization accuracy; after quantization, an operator scheduling algorithm based on layer-wise error analysis is combined with the hardware platform's operator fusion strategies, accelerating quantized inference through hardware-software co-design. Compared with the original INT8 quantization implementation, the proposed quantization algorithm improves accuracy by about 2% on average, providing highly available model quantization services for resource-constrained edge platforms.

3. Finally, starting from the platform's service capability, this paper designs and implements a lightweight model service platform for open environments. Building on the two research topics above, it combines container technology with a microservice architecture to construct an end-to-end lightweight model service platform that provides services such as model quantization, model service workflow orchestration, and model deployment. The aim is to provide technical support for the rapid deployment of deep learning models in open environments with multiple scenarios and tasks.
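The DAG-based workflow orchestration in point 1 can be illustrated with a minimal sketch. This is not the platform's actual implementation; the component names (`preprocess`, `infer`, `postprocess`) and the `ServiceDAG` class are hypothetical, and real systems would execute independent branches in parallel rather than sequentially:

```python
from collections import defaultdict, deque

class ServiceDAG:
    """Toy DAG of model-service components (illustrative only)."""

    def __init__(self):
        self.edges = defaultdict(list)   # parent component -> child components
        self.funcs = {}                  # component name -> callable(data) -> data

    def add(self, name, func, after=()):
        """Register a component, optionally declaring its upstream dependencies."""
        self.funcs[name] = func
        for parent in after:
            self.edges[parent].append(name)
        return self

    def run(self, data):
        """Kahn topological sort, then execute components in dependency order."""
        indeg = {n: 0 for n in self.funcs}
        for children in self.edges.values():
            for child in children:
                indeg[child] += 1
        queue = deque(n for n, d in indeg.items() if d == 0)
        order = []
        while queue:
            node = queue.popleft()
            order.append(node)
            for child in self.edges[node]:
                indeg[child] -= 1
                if indeg[child] == 0:
                    queue.append(child)
        for node in order:
            data = self.funcs[node](data)
        return data

# Compose a serial preprocess -> infer -> postprocess pipeline.
dag = (ServiceDAG()
       .add("preprocess", lambda xs: [v / 255.0 for v in xs])
       .add("infer", lambda xs: sum(xs), after=("preprocess",))
       .add("postprocess", lambda s: round(s, 3), after=("infer",)))
```

Because each component is a self-contained node, swapping a model or a postprocessing step only replaces one node, which is the maintainability benefit the abstract claims for open environments.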
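The per-layer quantization-parameter optimization in point 2 can be sketched as follows. This is a simplified stand-in, not the thesis's algorithm: it uses symmetric INT8 quantization and a plain grid search for the scale that minimizes per-layer reconstruction error, in place of the joint weight/parameter optimization described above; the function names are hypothetical:

```python
def quantize_int8(weights, scale):
    """Symmetric INT8 quantize-dequantize: clamp(round(w / scale)) * scale."""
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return [v * scale for v in q]

def mse(a, b):
    """Layer-wise reconstruction error between float and dequantized weights."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def best_scale(weights, n_steps=100):
    """Grid-search the scale minimizing reconstruction error, instead of the
    naive max(|w|) / 127 choice that outliers can distort."""
    base = max(abs(w) for w in weights) / 127
    best_err, best_s = float("inf"), base
    for i in range(1, n_steps + 1):
        s = base * i / n_steps       # candidate scales in (0, base]
        err = mse(weights, quantize_int8(weights, s))
        if err < best_err:
            best_err, best_s = err, s
    return best_s
```

Since the naive scale is itself one of the candidates, the searched scale can never do worse per layer, which mirrors the abstract's claim that optimizing quantization parameters layer by layer recovers accuracy lost to the simplest low-bit schemes.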