Research On Multi-Agent Pursuit-Evasion Based On Deep Reinforcement Learning

Posted on: 2021-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Xu
Full Text: PDF
GTID: 2518306104487374
Subject: Control Science and Engineering

Abstract/Summary:
With the development of modern information technology, the multi-agent pursuit-evasion game has received growing attention in military, industrial, agricultural and other fields. At present, most research on multi-agent pursuit-evasion is grounded in classical control theory, which constructs a control strategy from a mathematical model of the agent. This approach has a clear limitation: it overlooks how difficult it is to establish an accurate mathematical model of a real-world agent. This thesis therefore adopts deep reinforcement learning. Through trial-and-error exploration, agents make autonomous decisions while interacting with the environment and continually update their policies, with the aim of reaching optimal decisions. Deep reinforcement learning combines the strengths of deep learning and reinforcement learning: it offers both strong feature extraction and autonomous decision-making. It can map raw input data directly to decision outputs that control agent behavior, which makes it an artificial intelligence method closer to human thinking.

Building on the deep deterministic policy gradient (DDPG) algorithm, the thesis first modifies the traditional neural network structure to address scalability. Exploiting the dynamic characteristics of recurrent neural networks, it proposes a scalable multi-agent pursuit-evasion algorithm that addresses the scalability problem in multi-agent settings. Second, to address the instability, or even divergence, of deep reinforcement learning training under incomplete information, the thesis designs an auxiliary prediction network model (APNM) to infer the environmental state information of other agents. The multi-agent pursuit-evasion learning framework (MAPELF), built by incorporating APNM, addresses the incomplete information caused by agents' partial observability. Finally, to suit complex multi-agent environments, the traditional policy gradient method is improved, and a distributed multi-agent policy gradient algorithm is proposed to reduce the long neural network training time that arises when there are many agents.

The experimental results show that the proposed scalability algorithm improves the generalization ability of the traditional deep deterministic policy gradient algorithm, and that APNM alleviates the difficulty of training agents under incomplete information. Compared with traditional policy gradient algorithms, the distributed multi-agent policy gradient algorithm not only shortens training time but also improves stability and performance when there are many agents. Simulation experiments demonstrate the feasibility and effectiveness of the proposed algorithms.
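The abstract does not describe the network architecture, so the following is only an illustrative sketch of the general idea behind using a recurrent network for scalability: a recurrent cell can fold a variable number of per-agent observations into one fixed-size vector, so the downstream policy input does not change size when agents are added. The function names, dimensions, random weights and sample observations are all hypothetical and are not taken from the thesis.

```python
import math
import random


def make_rnn_params(obs_dim, hidden_dim, seed=0):
    """Randomly initialise one tanh recurrent cell (hypothetical sizes, untrained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(obs_dim)] for _ in range(hidden_dim)]
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(hidden_dim)]
    bias = [0.0] * hidden_dim
    return w_in, w_h, bias


def encode_agents(observations, params):
    """Fold a variable-length list of per-agent observations into a fixed-size vector."""
    w_in, w_h, bias = params
    hidden = [0.0] * len(bias)
    for obs in observations:
        # One recurrent step per observed agent: h <- tanh(W_in * obs + W_h * h + b)
        hidden = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[i], obs))
                + sum(w * h for w, h in zip(w_h[i], hidden))
                + bias[i]
            )
            for i in range(len(bias))
        ]
    return hidden


# Made-up observations: each inner list stands for one other agent's features.
params = make_rnn_params(obs_dim=4, hidden_dim=8)
three_agents = [[0.1, -0.2, 0.3, 0.0], [0.5, 0.4, -0.1, 0.2], [-0.3, 0.0, 0.2, 0.1]]
five_agents = three_agents + [[0.0, 0.1, 0.1, -0.4], [0.2, -0.2, 0.0, 0.3]]

embedding_3 = encode_agents(three_agents, params)
embedding_5 = encode_agents(five_agents, params)
print(len(embedding_3), len(embedding_5))  # both embeddings have length 8
```

The same weights handle three or five agents because the output size depends only on the hidden dimension. One caveat of this kind of encoder is that the result depends on the order in which agents are fed to the cell.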
Keywords/Search Tags: Multi-Agent Pursuit-Evasion, Deep Reinforcement Learning, Confrontation Game, Multi-Agent Deep Deterministic Policy Gradient