
Research On Multi-agent Cooperation Strategy Based On Reinforcement Learning

Posted on: 2021-03-15  Degree: Master  Type: Thesis
Country: China  Candidate: C Liang  Full Text: PDF
GTID: 2428330602979278  Subject: Control engineering
Abstract/Summary:
How to apply reinforcement learning to accomplish specific tasks in a multi-agent environment has long been a difficult problem in the field. Effective communication and coordination among multiple agents is an important step toward general artificial intelligence. Many traditional reinforcement learning algorithms can train a single agent in a simple environment; in a multi-agent environment, however, the complexity and dynamic nature of the environment make learning much harder: the state-action space explodes in dimensionality, target rewards are difficult to define, and the algorithms become unstable and hard to converge. This thesis introduces a multi-agent reinforcement learning method based on an improved DDPG. By combining the DDPG model with a bidirectional recurrent neural network and comparing it against other algorithms, the improved algorithm achieves significant gains in convergence speed and task completion. The main research contents are as follows:

(1) Survey the research status, at home and abroad, of traditional (single-agent) reinforcement learning and of multi-agent reinforcement learning algorithms; introduce the model structures of the classical algorithms and the basics of game theory as applied to multi-agent environments; and propose a multi-agent reinforcement learning algorithm based on inter-agent communication.

(2) For two recent communication-based multi-agent reinforcement learning algorithms, MADDPG and BiCNet, redefine the environmental rewards and tasks within the existing experimental environments, carry out experiments in different environments, and analyze the advantages and limitations of each from the experimental results. Combining the optimization ideas of these two algorithms, an improved algorithm based on DDPG is proposed.

(3) To address the low performance of the first two algorithms and their difficulty in adapting to different environments, the Mi-DDPG (Mixed Deep Deterministic Policy Gradient) algorithm first adds a bidirectional recurrent network to the Actor network as an information layer for homogeneous agents, and then adds heterogeneous-agent information to the Critic network to learn the multi-agent cooperation strategy. In addition, to reduce the training burden, a centralized-training, distributed-execution framework is adopted and the Q function in the Critic network is modularized. This improves both the performance and the execution efficiency of the algorithm while preserving its generalization ability across different environments.

(4) In the experiments, Mi-DDPG is compared with other algorithms in different scenarios. It shows clear improvements in convergence speed and task completion, and has potential value in real-world applications.
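The core idea of the information layer described in (3) can be sketched as follows. This is an illustrative NumPy toy, not the thesis's implementation: the class name, weight shapes, and use of plain tanh recurrent cells (rather than a trained deep bidirectional RNN inside a full DDPG actor) are all assumptions made for clarity. The point it demonstrates is that when a bidirectional recurrent layer runs over the sequence of homogeneous agents' observations, each agent's deterministic action can depend on information from every other agent.

```python
import numpy as np

class BiRNNActor:
    """Toy actor: a bidirectional recurrent layer scans the sequence of
    homogeneous agents' observations (the 'information layer'), so each
    agent's action is conditioned on all agents' observations."""

    def __init__(self, obs_dim, hidden_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.W_f = rng.normal(0, s, (hidden_dim, obs_dim))    # forward input weights
        self.U_f = rng.normal(0, s, (hidden_dim, hidden_dim)) # forward recurrent weights
        self.W_b = rng.normal(0, s, (hidden_dim, obs_dim))    # backward input weights
        self.U_b = rng.normal(0, s, (hidden_dim, hidden_dim)) # backward recurrent weights
        self.W_o = rng.normal(0, s, (act_dim, 2 * hidden_dim))  # per-agent action head
        self.hidden_dim = hidden_dim

    def forward(self, obs_seq):
        """obs_seq: (n_agents, obs_dim) -> actions: (n_agents, act_dim)."""
        n = obs_seq.shape[0]
        h_f = np.zeros((n, self.hidden_dim))
        h_b = np.zeros((n, self.hidden_dim))
        h = np.zeros(self.hidden_dim)
        for t in range(n):                    # forward sweep over agents
            h = np.tanh(self.W_f @ obs_seq[t] + self.U_f @ h)
            h_f[t] = h
        h = np.zeros(self.hidden_dim)
        for t in reversed(range(n)):          # backward sweep over agents
            h = np.tanh(self.W_b @ obs_seq[t] + self.U_b @ h)
            h_b[t] = h
        # concatenate both directions; deterministic actions bounded in [-1, 1]
        return np.tanh(np.concatenate([h_f, h_b], axis=1) @ self.W_o.T)

actor = BiRNNActor(obs_dim=4, hidden_dim=8, act_dim=2)
actions = actor.forward(np.ones((3, 4)))  # three homogeneous agents
print(actions.shape)                      # one action vector per agent: (3, 2)
```

Because the recurrent sweeps run in both directions, perturbing one agent's observation changes the actions of the others, which is exactly what a per-agent feedforward actor (as in vanilla DDPG) cannot do without explicit communication.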
Keywords/Search Tags: reinforcement learning, deep learning, multi-agent, RNN, DDPG, Actor-Critic