In recent years, with the continuous development of science and technology, deep reinforcement learning has achieved many remarkable results in the single-agent setting. In the multi-agent setting, however, the growing number of agents and the increased complexity of the environment mean that agents cannot cope with the explosive growth of information generated during interaction. How to make agents communicate effectively and select information during learning, so as to improve the collaborative ability between agents, has therefore become an important research topic. This paper studies effective communication and the conflicts that arise in the course of communication. The main research contents are as follows:

(1) Existing multi-agent communication schemes handle the large amount of information exchanged by the agents poorly: each agent learns indiscriminately from the information of all other agents. This paper proposes a multi-agent information processing method based on a symbolic (signed) attention mechanism. The method extends the conventional multi-agent attention mechanism with sign information instead of attending to the remaining agents indiscriminately, as traditional information processing methods do. Its key idea is to consider the similarity between agents comprehensively, covering both positive and negative similarity, so that each agent learns the information of the most relevant agents in a more complete way (see the first sketch below). The method not only reduces the amount of information each agent in the multi-agent system has to process, but also removes a large amount of redundant information irrelevant to the agent itself and improves its sample efficiency. Experimental results in two classical scenarios show that the method achieves better learning performance and obtains a higher return over the same training length, enabling agents to make better decisions.

(2) Existing conflict resolution methods for agents require complex modeling and have low efficiency. This paper therefore proposes a multi-agent conflict handling method based on the Double Deep Q-Network (DDQN) algorithm. The method first uses DDQN to estimate each agent's cumulative return, assigns the agents a priority order according to these returns, and lets them make decisions in that order; each agent then selects its action independently, so that the effects of conflicts are avoided (see the second sketch below). Unlike traditional conflict handling, the method does not rely on manually specified rules for the agents, which become impossible to set up in complex scenarios. It thus enables agents to resolve conflicts autonomously and removes the need for complex environment modeling. Finally, simulation experiments that reproduce real-world intelligent-vehicle conflict scenes are carried out; the results show that the method handles conflicts better and helps the agents make better decisions.
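
The abstract does not give the exact form of the symbolic (signed) attention, so the following is only a minimal sketch of the stated idea: weight the other agents by both positive and negative similarity and keep only the most relevant ones. The function name `signed_attention`, the `top_k` cutoff, and the scaling are illustrative assumptions, not the formulation used in the paper.

import numpy as np

def signed_attention(query, keys, values, top_k=3):
    """Sketch of signed attention over other agents' messages.

    query:  (d,)   encoding of the current agent's observation
    keys:   (N, d) encodings of the other agents
    values: (N, d) messages sent by the other agents

    Instead of a plain softmax over similarities, the sign of each
    similarity is kept, so strongly negatively correlated agents can
    also be selected as relevant (positive vs. negative similarity).
    """
    sims = keys @ query / np.sqrt(query.shape[0])   # signed similarities
    signs = np.sign(sims)
    # keep only the agents with the largest |similarity|, drop the rest
    idx = np.argsort(-np.abs(sims))[:top_k]
    weights = np.zeros_like(sims)
    weights[idx] = np.exp(np.abs(sims[idx]))
    weights /= weights.sum() + 1e-8
    # signed aggregation of the selected agents' messages
    return (signs * weights) @ values

# toy usage: 5 other agents, 8-dimensional encodings
rng = np.random.default_rng(0)
print(signed_attention(rng.normal(size=8),
                       rng.normal(size=(5, 8)),
                       rng.normal(size=(5, 8))))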
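
Likewise, the DDQN-based priority ordering is only described at a high level. The sketch below assumes each agent's DDQN Q-values and the resources its candidate actions would occupy are already available; the names `resolve_conflicts` and `targets`, and the use of the maximum Q-value as the cumulative-return estimate, are assumptions made for illustration.

import numpy as np

def resolve_conflicts(q_values, targets):
    """Sketch of priority-based conflict handling.

    q_values: (N, A) Q-value estimates from each agent's DDQN for its
              A candidate actions in the current state.
    targets:  (N, A) the resource (e.g. grid cell or road segment) that
              each candidate action of each agent would occupy.

    Each agent's largest Q-value serves as its estimated cumulative
    return; agents act in descending order of that estimate, and a
    lower-priority agent skips any action whose target is already taken.
    """
    q_values = np.asarray(q_values, dtype=float)
    priority = np.argsort(-q_values.max(axis=1))   # higher return acts first
    taken, actions = set(), {}
    for i in priority:
        for a in np.argsort(-q_values[i]):         # best still-feasible action
            if targets[i][a] not in taken:
                actions[i] = int(a)
                taken.add(targets[i][a])
                break
    return actions

# toy usage: 3 vehicles, 2 candidate moves each, some moves share a cell
print(resolve_conflicts([[0.9, 0.4], [0.8, 0.7], [0.2, 0.6]],
                        [["c1", "c2"], ["c2", "c3"], ["c3", "c4"]]))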