
Deep Reinforcement Learning Based Multi-agent Formation Methods

Posted on: 2021-11-14 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Y Hong
GTID: 2518306476952439 | Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the rapid development of artificial intelligence technology, multi-agent systems have been widely used in civil, military and other fields. Compared with a single-agent system, a multi-agent system can complete tasks more efficiently through collaboration, and has better adaptability and fault tolerance. Formation technology is a core capability that multi-agent systems need in order to complete tasks. Traditional formation control methods rely heavily on models of the environment and the agents, as well as on computing resources. They therefore scale poorly and are difficult to adapt to complex tasks that combine formation with obstacle avoidance and navigation. Based on deep reinforcement learning, this thesis implements multi-agent formation with autonomous obstacle avoidance and cooperation in a complex environment with multiple influencing factors and multiple objectives. The main research work and innovations of this thesis are as follows:

(1) Obstacle avoidance is an important basis for realizing multi-agent formation in complex environments. This thesis designs an obstacle avoidance method based on the Deep Deterministic Policy Gradient (DDPG) algorithm. A partially observable Markov environment suitable for reinforcement learning methods is built. Through proper design of experimental scenarios and reward functions, the obstacle avoidance problems of single and multiple agents are modeled. Following the idea of independent reinforcement learning, the DDPG algorithm is used to train each agent. Experimental results show that this method achieves autonomous obstacle avoidance for both a single agent and multiple agents. The high success rates in the obstacle avoidance scenarios indicate the effectiveness of the reinforcement learning based approach for these problems.

(2) To address multi-agent formation in different application scenarios, this thesis designs a formation method based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. Multiple formation experiment scenarios and corresponding reward functions are designed in the simulation environment. Because the DDPG algorithm struggles to meet the cooperation requirements of formation scenarios, the idea of multi-agent reinforcement learning is used, and a centralized training and decentralized execution framework is adopted. Agents trained by this method can complete a variety of complex tasks such as polygon formation, formation navigation, and formation switching. This multi-agent reinforcement learning based method performs better than the independent reinforcement learning algorithm, showing the advantage of multi-agent deep reinforcement learning for the multi-agent formation problem.

(3) Reinforcement learning can help agents learn better formation and obstacle avoidance strategies through continuous interaction with the environment, but training remains unstable and time-consuming. In response to these shortcomings, this thesis proposes an Asynchronous Multi-Agent Deep Deterministic Policy Gradient (AMADDPG) algorithm. To address the slow and difficult convergence of reinforcement learning algorithms, an asynchronous training framework is built in a parallel computing style, which improves the convergence speed of the algorithm. To address the inefficient use of experience replay, a prioritized data sampling method is used, including a prioritized experience replay buffer and prioritized batch data, which improves the efficiency of network parameter updates. Comparative experiments in the multi-agent formation scenario confirm that the AMADDPG algorithm improves both the convergence speed and the training effect of the reinforcement learning algorithm.
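The prioritized experience replay buffer mentioned in item (3) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual prioritization rule, so this assumes the common proportional scheme in which a transition's sampling probability grows with the magnitude of its TD error. The class name `PrioritizedReplayBuffer` and the parameters `alpha`, `beta`, and `eps` are hypothetical and are not taken from the thesis; the "prioritized batch data" component is not reproduced here.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the thesis code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity      # maximum number of stored transitions
        self.alpha = alpha            # assumed exponent: 0 = uniform sampling
        self.eps = eps                # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0                  # next slot to overwrite once full

    def add(self, transition):
        # New transitions get the current maximum priority so that each
        # one is likely to be sampled at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idx = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias introduced by
        # non-uniform sampling; they are normalized by the batch maximum.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, indices, td_errors):
        # After a critic update, refresh priorities with the new |TD error|.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop of this kind, the critic loss for each sampled transition would be multiplied by its importance-sampling weight, and `update_priorities` would be called with the freshly computed TD errors after every update step.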
Keywords/Search Tags: multi-agent system, formation, obstacle avoidance, deep reinforcement learning