In recent years, with the maturation and rapid development of AI technology, methods represented by deep reinforcement learning have achieved breakthroughs in single-agent settings. Because many real-world tasks require team collaboration, researchers have gradually extended deep reinforcement learning to the multi-agent domain. The multi-agent cooperative coverage problem is one of the most critical problems in this field. In a multi-agent cooperative coverage task, each agent observes not only its own local information but also global information about the other agents, which leads to slow convergence, unstable models, and related issues. To address these problems, this paper carries out the following research.

First, to address the low efficiency and slow learning speed of environment exploration in multi-agent cooperative coverage, this paper proposes a multi-agent cooperative exploration framework based on an adaptive noise policy (ANPEF). An adaptive noise policy network (ANPN) is first proposed, which adaptively and dynamically updates its parameters according to the agent's iteration count and makes action decisions based on the agent's observed state. Combined with the centralized training and decentralized execution framework, the adaptive-noise-based cooperative exploration framework ANPEF is then proposed, which enables in-depth exploration of complex multi-agent environments and reduces the likelihood of falling into local optima. Finally, the effectiveness and generality of the framework are studied by applying ANPEF to the MAAC and MADDPG algorithms, yielding ANPEF-MAAC and ANPEF-MADDPG, which address the incomplete and unstable environment exploration of the multi-agent cooperative coverage model.

Second, to address the low utilization of multi-agent experience samples and poor team rewards, this paper proposes an experience replay buffer mechanism based on meta-learning error classification (MECER). An error classification experience replay mechanism (ECER) is first proposed, which divides the experience pool into a recommendation pool and a common experience pool and stores samples separately, using the TD error of each experience sample as the importance criterion. Taking the idea of meta-learning as its starting point, the meta-learning error classification experience replay mechanism MECER is then proposed, which dynamically adjusts the sampling ratio parameters of the experience pools based on learned historical knowledge and improves the utilization of high-quality samples. Finally, MECER is applied to the MAAC and MADDPG algorithms to address the insufficient utilization of experience samples and slow convergence in multi-agent cooperative coverage tasks.

Finally, the two algorithms incorporating the ANPEF framework and the MECER mechanism are implemented on two cooperative coverage tasks, cooperative communication and cooperative navigation, in the abstract multi-particle environment, and are compared with existing algorithms; ablation experiments on ECER and MECER further verify the effectiveness of the proposed algorithms.
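To make the adaptive noise idea concrete, the following is a minimal sketch of iteration-dependent exploration noise: the noise scale is recomputed from the current iteration count and added to the policy's action. The class name, decay schedule, and parameter values are illustrative assumptions, not the thesis's exact ANPN formulation, which is a learned policy network.

```python
import numpy as np

class AdaptiveNoisePolicy:
    """Illustrative adaptive-noise action selection: exploration noise is
    broad early in training and narrows as the iteration count grows.
    (Hypothetical sketch; the actual ANPN updates network parameters.)"""

    def __init__(self, act_dim, sigma_init=0.3, sigma_min=0.02, decay=1e-4):
        self.act_dim = act_dim
        self.sigma_init = sigma_init
        self.sigma_min = sigma_min
        self.decay = decay

    def noise_scale(self, iteration):
        # Exponentially decay the noise scale with the iteration count.
        return max(self.sigma_min, self.sigma_init * np.exp(-self.decay * iteration))

    def act(self, mean_action, iteration):
        # Perturb the deterministic action with iteration-dependent Gaussian noise.
        sigma = self.noise_scale(iteration)
        noisy = np.asarray(mean_action) + np.random.normal(0.0, sigma, size=self.act_dim)
        return np.clip(noisy, -1.0, 1.0)
```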
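Similarly, a minimal sketch of the two-pool replay idea behind ECER/MECER is given below: transitions whose TD error exceeds a threshold are stored in a recommendation pool, the rest in a common pool, and each batch is drawn as a mixture controlled by a sampling ratio that an outer (meta-learning) loop could adjust. The threshold, ratio, and method names are assumptions for illustration only.

```python
import random
from collections import deque

class ErrorClassifiedReplay:
    """Illustrative two-pool replay buffer keyed on TD-error magnitude.
    (Hypothetical sketch; parameter values are placeholders.)"""

    def __init__(self, capacity=100000, td_threshold=0.5, ratio=0.5):
        self.recommended = deque(maxlen=capacity)  # high-TD-error samples
        self.common = deque(maxlen=capacity)       # remaining samples
        self.td_threshold = td_threshold
        self.ratio = ratio  # fraction of each batch drawn from the recommended pool

    def add(self, transition, td_error):
        # Classify the sample by the magnitude of its TD error.
        if abs(td_error) >= self.td_threshold:
            self.recommended.append(transition)
        else:
            self.common.append(transition)

    def sample(self, batch_size):
        # Draw a mixed batch from both pools according to the current ratio.
        n_rec = min(int(batch_size * self.ratio), len(self.recommended))
        n_com = min(batch_size - n_rec, len(self.common))
        batch = (random.sample(list(self.recommended), n_rec)
                 + random.sample(list(self.common), n_com))
        random.shuffle(batch)
        return batch

    def set_ratio(self, ratio):
        # Hook for an outer meta-learning loop to adjust the sampling mixture.
        self.ratio = float(min(max(ratio, 0.0), 1.0))
```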