Deep reinforcement learning has achieved remarkable results in the natural sciences, engineering, medicine, and operations research in recent years. In more general real-world scenarios, intelligent algorithms often need to control more than one agent, so multi-agent reinforcement learning is attracting growing attention from academia and industry. However, the prevailing framework of centralized training with decentralized execution has two unavoidable shortcomings: 1) maintaining a centralized controller is unacceptable in terms of cost or security; 2) because training each agent relies on information from the other agents, the framework does not scale well to large-scale multi-agent scenarios. To address these problems, this paper proposes a general fully decentralized multi-agent reinforcement learning framework and a sparse-attention-based multi-agent reinforcement learning algorithm. The latter is the special case of the decentralized framework in which the constraint term is a sparsity constraint. The specific contributions are as follows. 1) To remove the central controller from training, this paper takes the actor-critic algorithm as the basis of the decentralized framework and jointly optimizes the actors and critics. A consensus constraint is introduced into this joint optimization problem so that some parameters are shared among the independent agents, which improves the generalization performance of the framework. The resulting separable optimization problem with linear constraints is then solved with an algorithm based on the primal-dual hybrid gradient method, yielding a general fully decentralized multi-agent reinforcement learning framework. 2) To improve scalability in large-scale multi-agent scenarios, this paper introduces an attention mechanism into the multi-agent reinforcement learning algorithm and applies a sparsity constraint to the activation function that computes the attention weights, yielding a multi-agent reinforcement learning algorithm based on a sparse attention mechanism. Finally, experiments in eight multi-agent simulation environments of different scales and types show that the proposed fully decentralized framework outperforms the baselines in versatility, convergence speed, and final performance, and that the sparse-attention-based algorithm, as a special case of the framework, also outperforms the baselines in scalability, final performance, and interpretability.
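To illustrate what a sparse activation for attention weights can look like, the sketch below contrasts softmax with sparsemax, a well-known projection onto the probability simplex. The abstract does not name the activation actually used, so sparsemax is an assumption here, and the function names `softmax` and `sparsemax` as well as the example scores are hypothetical.

```python
import math


def softmax(scores):
    """Dense attention weights: every entry is strictly positive."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection of the scores onto the probability simplex.

    Assumed stand-in for the paper's sparse activation: low-scoring
    entries receive a weight of exactly zero.
    """
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    threshold = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        # The k-th largest score stays in the support while this holds;
        # the last k that satisfies it determines the threshold.
        if 1.0 + k * value > cumulative:
            threshold = (cumulative - 1.0) / k
    return [max(s - threshold, 0.0) for s in scores]


# Hypothetical attention scores of one agent over four other agents.
scores = [1.0, 0.8, 0.1, -1.0]
print(softmax(scores))    # all four weights are positive
print(sparsemax(scores))  # roughly [0.6, 0.4, 0.0, 0.0]
```

With weights of this kind, an agent attends to only a few of the other agents, which is one plausible reading of how sparsity could help both scalability and interpretability.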