Multi-agent Coordinated Control Technology Based On Reinforcement Learning

Posted on:2022-05-24

Degree:Master

Type:Thesis

Country:China

Candidate:Y S Jiang

Full Text:PDF

GTID:2518306494986429

Subject:Computer technology

Abstract/Summary:

With the help of deep learning, reinforcement learning has become a powerful method for solving sequential decision-making problems. Reinforcement learning is the process in which agents repeatedly interact with an environment to improve their own strategies and maximize their own returns. Multi-agent reinforcement learning, a branch of reinforcement learning, studies the problem from the perspective of multiple agents. It has a wide range of applications, such as traffic flow control, multiplayer adversarial games, and autonomous driving, and is becoming a focus of reinforcement learning research. This thesis studies algorithms for, and applications of, reinforcement learning in the multi-agent setting. The research contents are as follows:

(1) This thesis improves commonly used multi-agent reinforcement learning algorithms. In multi-agent reinforcement learning, agents are easily influenced by other agents and by the training environment, which leads to low-quality strategies and slow convergence. This thesis proposes a novel algorithm, the -Maximum Critic Multi-Agent Deep Deterministic Policy Gradient algorithm (-M2DDPG), which uses a new critic technique called -Maximum Critic to balance exploitation and exploration when updating the Q-value function. Furthermore, inspired by the idea of attention, we propose the -Maximum Attention Critic Multi-Agent Deep Deterministic Policy Gradient (-MA2DDPG) algorithm to improve computational efficiency. We empirically evaluate our algorithms in five kinds of mixed cooperative and communication environments. The experimental results show that our algorithms significantly accelerate the learning process and outperform the existing baseline algorithm, MADDPG.

(2) This thesis applies the algorithms to multi-group epidemic prevention and control in order to optimize mobile intervention strategies, reduce intervention costs, and control the spread of infectious diseases. The method dynamically divides the population into 5 categories according to each individual's status information and uses a reinforcement learning algorithm to learn an effective strategy for each group that maximizes the collective reward, so that the groups cooperate to learn the optimal strategy. The experimental results show that the algorithm outperforms the existing benchmark algorithm.

Keywords/Search Tags:

Multi-agent, Reinforcement learning, Maximum Critic, Epidemic Control

Related items

1	Research On Deep Reinforcement Learning Technology For Multi-agent Collaboration
2	Research On Multi-agent Reinforcement Learning Method Based On Stein Variational Gradient Descent
3	Study Of Multi-agent Learning Problem Based On Reinforcement Learning
4	Research And Application Of Reinforcement Learning In Multi-agent Collaboration
5	Research Of Multi-Agent Reinforcement Learning And Its Application
6	Research On Multi-agent Cooperation Method Based On Deep Reinforcement Learning
7	Decentralized Multi-agent Reinforcement Learning Algorithm Research
8	Cooperation Promotion Multi-Agent Reinforcement Learning
9	Supervised Reinforcement Learning:methods And Applications
10	The Research On Reinforcement Learning Based On Cooperative Multi-agent