
Combination Compatibility Analysis For Optimizing Parallel Inference Of Deep Learning Models

Posted on: 2024-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: M W Xie
Full Text: PDF
GTID: 2568306923457124
Subject: Artificial intelligence
Abstract/Summary:
The widespread adoption of deep learning has driven significant improvements in the representational capabilities of deep learning models, fueling the current boom in the AI industry. Deep learning models have achieved notable success in natural language understanding and other application fields, as exemplified by ChatGPT. This success rests largely on real-time inference and decision-making: models are pre-trained on server clusters using pre-processed datasets, loaded according to different task requests, and then produce decision responses after real-time inference computation. Real-time inference is therefore a key factor in the efficiency and performance of deep learning applications, and its main challenge is the parallel inference computation problem. Different task requests on the same server require different deep learning models; for example, autonomous-driving applications mainly call Convolutional Neural Network (CNN) models, while machine translation applications mainly call Transformer models. How to optimally combine different deep learning models, so as to fully utilize server computing resources such as Graphics Processing Units (GPUs) and improve the efficiency of parallel model execution for parallel inference decisions, is thus a central and difficult research topic in deep learning inference.

This thesis analyzes the intrinsic relationship between deep learning model combination optimization and parallel inference performance, and proposes a combination compatibility analysis for optimizing parallel inference of deep learning models. First, an empirical study of parallel inference reveals that the higher the compatibility between different deep learning models, the higher the computational efficiency of parallel real-time inference. Based on this observation, the thesis proposes a "compatibility analysis theory of deep learning models" and derives the compatibility order of the mainstream models CNN, Recurrent Neural Network (RNN), Transformer, Generative Adversarial Network (GAN), and Graph Neural Network (GNN), providing a data-driven basis for combination optimization. Second, an NP-completeness analysis method is developed for the combinatorial optimization problem; through this method, the deep learning model combination problem is proved to be NP-complete (NPC), which provides a reference for designing heuristic algorithms. Third, given the NPC property of the problem, a combination optimization algorithm based on compatibility analysis is proposed; it uses model compatibility analysis to schedule parallel inference in real time for different models and task requests, effectively improving the computational efficiency of parallel inference over model combinations.

To validate the proposed combination compatibility analysis, a deep learning inference cluster is built for three typical applications: computer vision, speech recognition, and natural language understanding. Parallel inference experiments with different models on the MNIST, SQuAD, and Cora datasets yield the compatibility order of the mainstream models CNN, RNN, Transformer, GAN, and GNN. The experimental results show that the proposed algorithm improves the computational efficiency of parallel inference by about 20% compared with other algorithms. To further verify the credibility and practicality of this research, a parallel inference experimental platform supporting deep model combination is developed, providing more reliable support for academic research and practical applications in related fields.
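The compatibility-driven combination strategy described above can be illustrated with a minimal sketch. This is not the thesis's actual algorithm: the function name, the greedy pairing policy, and the numeric compatibility scores below are all hypothetical, standing in for the empirically derived compatibility order that the thesis measures on real GPU clusters.

```python
# Hypothetical sketch: greedily pair queued inference requests so that
# each GPU co-locates the two model families with the highest pairwise
# compatibility score. Scores are illustrative placeholders, NOT the
# thesis's measured values.
COMPAT = {
    frozenset({"CNN", "RNN"}): 0.9,
    frozenset({"CNN", "Transformer"}): 0.7,
    frozenset({"RNN", "Transformer"}): 0.6,
    frozenset({"CNN", "GAN"}): 0.5,
    frozenset({"Transformer", "GAN"}): 0.4,
    frozenset({"GAN", "GNN"}): 0.3,
}

def pair_requests(requests):
    """Greedily form co-execution pairs in descending compatibility order.

    Returns (pairs, leftover): pairs of requests to run in parallel on
    one GPU, plus at most one unpaired leftover request.
    """
    pending = list(requests)
    pairs = []
    while len(pending) > 1:
        # Pick the index pair with the highest compatibility score;
        # unknown combinations (including same-model pairs) score 0.0.
        i, j = max(
            ((a, b) for a in range(len(pending)) for b in range(a + 1, len(pending))),
            key=lambda ab: COMPAT.get(frozenset({pending[ab[0]], pending[ab[1]]}), 0.0),
        )
        pairs.append((pending[i], pending[j]))
        del pending[j], pending[i]  # delete larger index first to keep i valid
    return pairs, pending
```

For three queued requests, the sketch pairs the most compatible two and leaves the third for the next scheduling round; the real scheduler would additionally account for GPU memory limits and request deadlines.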
Keywords/Search Tags: GPU, deep learning models, parallel inference, combination optimization, compatibility