Multi-agent Confrontation Algorithm Based On Reinforcement Learning

Posted on:2022-11-19

Degree:Master

Type:Thesis

Country:China

Candidate:S N Hou

GTID:2518306743951469

Subject:Control Engineering

Abstract/Summary:

A multi-agent system (MAS) is a computerized system composed of multiple agents that interact with a shared environment. Because deep reinforcement learning has strong exploration and decision-making capabilities, it has become the mainstream method for intelligent decision-making in multi-agent systems. As artificial intelligence technology has developed, multi-agent reinforcement learning has been widely applied, and the problem of collaborative confrontation has strong research value. Deep reinforcement learning research on multi-agent collaborative confrontation aims to obtain the optimal strategy for achieving a goal through interaction between a formation of agents and the environment. The evolution of a multi-agent collaborative confrontation environment is affected by the actions of all agents. Because the number of agents is large and some agents are not controlled by one's own side, the environment is complex, dynamic and unstable. In addition, the complexity of a multi-agent system grows with the number of agents, which produces a huge exploration space, and because the strategy keeps changing as learning proceeds, experience replay samples are used inefficiently. These problems seriously degrade the performance of deep reinforcement learning algorithms on MAS. This thesis reviews the historical development of multi-agent reinforcement learning and builds on existing work. The main research content consists of the following two parts:

(1) For the problem of complex, dynamic and unstable environments, a multi-agent collaborative confrontation algorithm with unknown-agent behavior prediction is proposed. The main body of the algorithm adopts a value decomposition network structure, combines supervised learning with reinforcement learning, and introduces an unknown-agent behavior prediction module. This module builds and trains a supervised auxiliary model on the historical features and executed actions of the unknown agent in order to predict that agent's actions. The value decomposition network merges the output of the prediction module with environmental state information to make decisions. Experiments show that the algorithm outperforms current mainstream baseline algorithms in the SMAC StarCraft II environment and the MaCA formation confrontation environment.

(2) To address the low sample efficiency of multi-agent experience replay, a robust multi-layer construction method for multi-agent reinforcement learning experience replay is proposed. The method organizes experience replay into three levels. First, the storage scheme of the experience replay buffer is improved with the reservoir sampling algorithm. Second, a similarity-measure screening step selects a sample set that encourages exploration. Finally, importance sampling based on policy changes is performed on this set, which improves the stability and credibility of the samples. Experiments show that the method performs well in both the SMAC and MaCA environments.
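
The abstract gives no formulas for part (1), so the following is only a sketch of how a prediction module could feed a value decomposition network. It assumes the standard additive decomposition, in which the joint value is the sum of per-agent values. All symbols are illustrative and not taken from the thesis.

```latex
% Assumed notation (hypothetical, for illustration only):
%   h        historical features of the unknown agent
%   a        action actually executed by the unknown agent (supervised label)
%   \hat{a}  predicted action of the unknown agent
%   \tau_i   action-observation history of controlled agent i
%   u_i      action of controlled agent i

% Supervised auxiliary model, trained with a cross-entropy loss:
\mathcal{L}_{\mathrm{sup}}(\phi) = -\log p_\phi(a \mid h),
\qquad
\hat{a} = \arg\max_{a'} \, p_\phi(a' \mid h)

% Additive value decomposition, with the prediction as an extra input:
Q_{\mathrm{tot}}(\boldsymbol{\tau}, \mathbf{u}, \hat{a})
  = \sum_{i=1}^{n} Q_i(\tau_i, u_i, \hat{a}; \theta_i)
```

Under this assumption each controlled agent can still act greedily on its own value function at execution time, while the predicted action supplies information about the agent that one's own side does not control.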
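
The three-level replay structure described in part (2) can be sketched as follows. The abstract does not specify the similarity measure or the form of the importance weights, so Euclidean distance between states, greedy screening, and a clipped policy ratio are assumptions made here for illustration. The class and function names are hypothetical as well.

```python
import math
import random


class ReservoirBuffer:
    """Fixed-capacity buffer holding a uniform sample of all transitions seen."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, transition):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(transition)
        else:
            # Reservoir sampling (Algorithm R): the new transition replaces a
            # stored one with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = transition


def screen_dissimilar(transitions, min_distance):
    """Greedily keep transitions whose state is far from every kept state.

    Assumption: Euclidean distance stands in for the similarity measure.
    """
    kept = []
    for t in transitions:
        if all(math.dist(t["state"], k["state"]) >= min_distance for k in kept):
            kept.append(t)
    return kept


def importance_weights(transitions, current_policy, clip=2.0):
    """Clipped ratio pi_current(a|s) / pi_behaviour(a|s), normalised to sum to 1.

    Assumption: each transition stores the probability its action had under
    the policy that generated it, and at least one ratio is positive.
    """
    ratios = [
        min(clip, current_policy(t["state"], t["action"]) / t["behaviour_prob"])
        for t in transitions
    ]
    total = sum(ratios)
    return [r / total for r in ratios]


# Usage with synthetic transitions (made-up data, for illustration only).
buffer = ReservoirBuffer(capacity=100)
for step in range(1000):
    buffer.add(
        {"state": (step % 10, step % 7), "action": step % 2, "behaviour_prob": 0.5}
    )

candidates = screen_dissimilar(buffer.items, min_distance=1.5)
weights = importance_weights(candidates, lambda s, a: 0.8 if a == 0 else 0.2)
batch = random.choices(candidates, weights=weights, k=8)
```

Reservoir storage keeps older experience represented after the buffer fills, rather than always evicting the oldest transitions. The screening step trades buffer density for state diversity, and the clipped ratio limits the influence of transitions collected under a policy that differs strongly from the current one.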

Keywords/Search Tags:

reinforcement learning, multi-agent system, behavior prediction, experience replay, importance sampling

Related items

1	Research On Experience Replay Method For Deep Reinforcement Learning
2	Research On Deep Reinforcement Learning Technology For Multi-agent Collaboration
3	Research On Experience Replay In Deep Reinforcement Learning
4	Research On Optimization Methods Of The Experience Replay Mechanism For Off-policy Reinforcement Learning
5	Research And Implementation On Game Control Algorithm Based On Deepening Reinforcement Learning
6	Experience Replay In Multi-Agent Deep Reinforcement Learning
7	Research Of Multi-agent Cooperation Based On Deep Reinforcement Learning
8	Deep Reinforcement Learning With Experience Replay
9	Research On Optimization Method Of Deep Reinforcement Learning Experience Replay
10	Improvement And Application Of Deep Reinforcement Learning Based On Experience Replay Mechanism