
Combination Compatibility Analysis For Optimizing Parallel Inference Of Deep Learning Models

Posted on: 2024-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: M W Xie
Full Text: PDF
GTID: 2568306923457124
Subject: Artificial intelligence
Abstract/Summary:
The widespread adoption of deep learning has driven significant improvements in the representational capabilities of deep learning models, fueling the current boom in the AI industry. Deep learning models have achieved notable success in natural language understanding and other application fields, as exemplified by ChatGPT. This success rests largely on real-time inference and decision-making: models are pre-trained on server clusters using pre-processed datasets, loaded according to different task requests, and then produce decision responses after real-time inference computation. Real-time inference is therefore a key factor in the efficiency and performance of deep learning applications, and its main challenge is the parallel inference computation problem. Different task requests on the same server require different deep learning models; for example, autonomous-driving applications mainly call Convolutional Neural Network (CNN) models, while machine translation applications mainly call Transformer models. How to optimally combine different deep learning models, so as to fully utilize server computing resources such as Graphics Processing Units (GPUs) and improve the efficiency of parallel model execution for parallel inference decisions, is thus a central and difficult research topic in deep learning inference.

This thesis analyzes the intrinsic relationship between deep learning model combination optimization and parallel inference performance, and proposes a combination compatibility analysis for optimizing parallel inference of deep learning models. First, an empirical study of parallel inference reveals that the higher the compatibility between different deep learning models, the higher the computational efficiency of parallel real-time inference. Based on this observation, the thesis proposes a "compatibility analysis theory of deep learning models" and derives the compatibility order of the mainstream models CNN, Recurrent Neural Network (RNN), Transformer, Generative Adversarial Network (GAN), and Graph Neural Network (GNN), providing a data-driven basis for combination optimization. Second, an NP-completeness analysis method is developed for the combinatorial optimization problem; through this method, the deep learning model combination problem is proved to be NP-complete (NPC), which provides a reference for designing heuristic algorithms. Third, given the NPC property of the problem, a combination optimization algorithm based on compatibility analysis is proposed; it uses model compatibility analysis to schedule parallel inference in real time for different models and task requests, effectively improving the computational efficiency of parallel inference over model combinations.

To validate the proposed combination compatibility analysis, a deep learning inference cluster is built for three typical applications: computer vision, speech recognition, and natural language understanding. Parallel inference experiments with different models on the MNIST, SQuAD, and Cora datasets yield the compatibility order of the mainstream models CNN, RNN, Transformer, GAN, and GNN. The experimental results show that the proposed algorithm improves the computational efficiency of parallel inference by about 20% compared with other algorithms. To further verify the credibility and practicality of this research, a parallel inference experimental platform supporting deep model combination is developed, providing more reliable support for academic research and practical applications in related fields.
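The compatibility-driven combination strategy described above can be illustrated with a minimal sketch. This is not the thesis's actual algorithm: the function name, the greedy pairing policy, and the numeric compatibility scores below are all hypothetical, standing in for the empirically derived compatibility order that the thesis measures on real GPU clusters.

```python
# Hypothetical sketch: greedily pair queued inference requests so that
# each GPU co-locates the two model families with the highest pairwise
# compatibility score. Scores are illustrative placeholders, NOT the
# thesis's measured values.
COMPAT = {
    frozenset({"CNN", "RNN"}): 0.9,
    frozenset({"CNN", "Transformer"}): 0.7,
    frozenset({"RNN", "Transformer"}): 0.6,
    frozenset({"CNN", "GAN"}): 0.5,
    frozenset({"Transformer", "GAN"}): 0.4,
    frozenset({"GAN", "GNN"}): 0.3,
}

def pair_requests(requests):
    """Greedily form co-execution pairs in descending compatibility order.

    Returns (pairs, leftover): pairs of requests to run in parallel on
    one GPU, plus at most one unpaired leftover request.
    """
    pending = list(requests)
    pairs = []
    while len(pending) > 1:
        # Pick the index pair with the highest compatibility score;
        # unknown combinations (including same-model pairs) score 0.0.
        i, j = max(
            ((a, b) for a in range(len(pending)) for b in range(a + 1, len(pending))),
            key=lambda ab: COMPAT.get(frozenset({pending[ab[0]], pending[ab[1]]}), 0.0),
        )
        pairs.append((pending[i], pending[j]))
        del pending[j], pending[i]  # delete larger index first to keep i valid
    return pairs, pending
```

For three queued requests, the sketch pairs the most compatible two and leaves the third for the next scheduling round; the real scheduler would additionally account for GPU memory limits and request deadlines.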
Keywords/Search Tags: GPU, deep learning models, parallel inference, combination optimization, compatibility