Deep learning technology has been widely applied in image processing, speech, natural language processing, and other fields. With the continuing growth of neural networks and datasets, improving the operating efficiency of models has become a major challenge. In recent years, multi-core Artificial Intelligence chips based on a pipeline-parallel mode have emerged as an effective solution to this problem, and how to deploy neural network models onto the physical cores of such chips has become a valuable research question. The key to this problem is how the computational graph representation of the neural network is placed across the cores. The main subject of this thesis is how to generate a multi-core placement for the subgraphs obtained by dividing the computational graph, so as to reduce the cost of inter-core communication.

In this thesis, the computational graph placement problem is first described abstractly and modeled as a Markov decision process. Placement algorithms based on the deep reinforcement learning methods REINFORCE, DQN, and PPO are then designed to optimize the placement decision process, and these three algorithms achieve high-quality placements on small-scale chips. Given the limitations of these three single-agent algorithms on large-scale chip placement, improved placement algorithms based on the asynchronous Ape-X and APPO architectures are designed, and distributed training is explored to further shorten training time. Finally, based on these algorithmic results, a prototype system for placing computational graphs on physical cores is developed, which provides placement model training, placement scheme generation, and display of placement results.

The main contributions and innovations of this thesis are as follows:
(1) An environment description method for placing computational graphs on physical cores is defined, and an action constraint strategy named CORES-MASK is designed to dynamically update the set of selectable actions, enforcing the constraint that the same core cannot be selected more than once in the placement environment.
(2) Multi-agent parallel sampling under the asynchronous architecture effectively increases the randomness and diversity of the training data and raises the reward ceiling reached during training, overcoming the difficulty of learning effectively under single-agent sampling.
(3) A reward-based two-end priority sampling algorithm is proposed. Taking the reward of each trajectory as its priority, the stored data are sorted, and training batches are drawn from both ends of the ranking, so as to maximize the probability of selecting high-reward actions and reduce the probability of selecting low-reward actions. The experimental results show that this sampling method alleviates the slow growth of rewards in the early stage of APPO training and also improves the final convergence.
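To make the Markov decision process formulation concrete, the following is a minimal sketch of a placement environment in Python: the state is the partial assignment of subgraphs to cores, an action selects a physical core for the next subgraph, and the reward penalizes the inter-core communication induced by that choice. The class name, observation encoding, and cost model are illustrative assumptions, not the environment description actually defined in the thesis.

```python
import numpy as np

class PlacementEnv:
    """Sketch of computational-graph placement as a Markov decision process.

    comm_cost[i, j] is assumed to be the data volume exchanged between
    subgraph i and subgraph j; the real thesis environment may use a
    different observation and cost definition.
    """

    def __init__(self, comm_cost: np.ndarray, num_cores: int):
        self.comm_cost = comm_cost
        self.num_subgraphs = comm_cost.shape[0]
        self.num_cores = num_cores
        self.reset()

    def reset(self):
        self.assignment = np.full(self.num_subgraphs, -1)  # -1 = not yet placed
        self.step_idx = 0                                   # next subgraph to place
        return self._observe()

    def _observe(self):
        # Observation: current partial assignment plus index of the next subgraph.
        return np.append(self.assignment.copy(), self.step_idx)

    def step(self, core_id: int):
        self.assignment[self.step_idx] = core_id
        # Reward: negative communication added by this placement, i.e. traffic
        # to already-placed subgraphs that ended up on a different core.
        placed = np.where(self.assignment >= 0)[0]
        cost = sum(
            self.comm_cost[self.step_idx, j]
            for j in placed
            if j != self.step_idx and self.assignment[j] != core_id
        )
        self.step_idx += 1
        done = self.step_idx == self.num_subgraphs
        return self._observe(), -float(cost), done, {}
```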
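The CORES-MASK strategy described in contribution (1) can be illustrated by the sketch below, which keeps a Boolean mask over the cores and removes a core from the policy's action distribution once it has been assigned. The class and method names are hypothetical; only the idea of dynamically shrinking the set of selectable actions comes from the thesis.

```python
import numpy as np

class CoresMask:
    """Illustrative action-constraint mask in the spirit of CORES-MASK:
    a core that has already been chosen can no longer be selected."""

    def __init__(self, num_cores: int):
        self.available = np.ones(num_cores, dtype=bool)  # True = still selectable

    def apply(self, logits: np.ndarray) -> np.ndarray:
        """Set logits of used cores to -inf so their softmax probability is zero."""
        masked = logits.copy()
        masked[~self.available] = -np.inf
        return masked

    def commit(self, core_id: int) -> None:
        """Mark a core as used after a subgraph has been placed on it."""
        self.available[core_id] = False

    def reset(self) -> None:
        """Make every core selectable again at the start of an episode."""
        self.available[:] = True
```

In use, the mask would be applied to the policy logits before sampling each action and updated with `commit` after the chosen core is assigned, so repeated selection of the same core never occurs during rollout.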
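The reward-based two-end priority sampling of contribution (3) can be sketched as follows: trajectories are ranked by their episode reward and training batches are drawn from both ends of the ranking, so high-reward behaviour is reinforced more often while low-reward behaviour is still seen and can be discouraged. The function name, the split ratio, and the (reward, trajectory) tuple format are assumptions for illustration, not the exact implementation in the thesis.

```python
import random

def two_end_priority_sample(trajectories, batch_size, high_frac=0.7):
    """Draw a training batch from both ends of a reward-sorted trajectory buffer.

    trajectories: list of (episode_reward, trajectory_data) tuples.
    high_frac: assumed fraction of the batch taken from the high-reward end.
    """
    ranked = sorted(trajectories, key=lambda t: t[0])   # ascending by reward
    n_high = max(1, int(batch_size * high_frac))        # most samples: high-reward end
    n_low = batch_size - n_high                         # remainder: low-reward end
    batch = ranked[-n_high:] + ranked[:n_low]
    random.shuffle(batch)                               # avoid ordering bias in updates
    return batch
```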