In a multi-agent system it is impossible to build a prior model with full knowledge, because the environment is dynamic and the actions of the other agents are unknown; much of the relevant knowledge is only obtained as the agents interact with one another. Agents in a complicated environment must adjust their actions according to prior experience, that is, they must be able to learn and adapt, so learning techniques are very important in multi-agent applications. At the same time, a single agent usually cannot complete a complicated mission because of its limited resources and abilities, so cooperation among several agents is necessary. Adding a learning mechanism is one effective way to make agents cooperate: on the one hand, it helps the agents cooperate effectively, and on the other hand, it improves the learning ability of the multi-agent system as a whole.

This article first reviews the origins and research foundations of agents and multi-agent systems, together with the learning methods used in them, and then introduces the essential background on multi-agent cooperation, reinforcement learning, and reinforcement learning in multi-agent systems. In a general multi-agent system, each agent updates its action policy to maximize its own private reward. We combine co-evolution with an evolutionarily stable genetic algorithm and apply it to multi-agent systems, which thereby become co-evolutionary multi-agent systems (CEMAS). Such a system is composed of two or more populations, each population representing one agent. Each agent in CEMAS evolves sequentially and repeatedly applies the evolutionarily stable genetic algorithm to maximize the overall utility of the system. In this paper we use the dispersion game, in which the global fitness is maximal only when every agent chooses a different action, so the agents learn to disperse their action choices.
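The sequential co-evolution scheme described above can be illustrated with a minimal sketch. All names, population sizes, and rates below are hypothetical choices for illustration, not taken from the paper: one population per agent, each individual being a single action, evaluated by joining it with the other agents' current representatives under the dispersion-game fitness.

```python
import random

N_AGENTS = 4      # number of agents, i.e. co-evolving populations (illustrative)
N_ACTIONS = 4     # actions available to each agent (illustrative)
POP_SIZE = 20
GENERATIONS = 40
MUT_RATE = 0.1

def global_fitness(joint_action):
    # Dispersion-game reward: the number of distinct actions chosen,
    # maximal only when every agent picks a different action.
    return len(set(joint_action))

def evolve(seed=0):
    rng = random.Random(seed)
    # One population of candidate actions per agent.
    pops = [[rng.randrange(N_ACTIONS) for _ in range(POP_SIZE)]
            for _ in range(N_AGENTS)]
    best = [pop[0] for pop in pops]  # current representative of each agent

    for _ in range(GENERATIONS):
        for i in range(N_AGENTS):    # agents evolve sequentially
            def fitness(action):
                # Score a candidate against the other agents' representatives.
                joint = best[:i] + [action] + best[i + 1:]
                return global_fitness(joint)

            new_pop = []
            for _ in range(POP_SIZE):
                # Binary tournament selection plus random mutation.
                a, b = rng.choice(pops[i]), rng.choice(pops[i])
                winner = a if fitness(a) >= fitness(b) else b
                if rng.random() < MUT_RATE:
                    winner = rng.randrange(N_ACTIONS)
                new_pop.append(winner)
            pops[i] = new_pop
            best[i] = max(pops[i], key=fitness)

    return best, global_fitness(best)
```

Because each agent's fitness is the global dispersion score given the others' current choices, the populations are driven toward mutually distinct actions, which is the cooperative optimum of the game.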
Experiments show that this algorithm is well suited to multi-agent systems and that it optimizes the global fitness in fewer generations.

Reinforcement learning is one of the most useful methods in multi-agent learning. The balance between exploration and exploitation decides whether an agent explores policies it has not yet tried or exploits the actions it has already learned. Most existing action-selection strategies use only the knowledge gathered during the current learning process; few make use of the policies obtained in past learning. To exploit this prior knowledge and raise the cooperation ability of the agents, we adopt a policy-reuse method within the framework of stochastic games and reinforcement learning: the policies of solved tasks are saved and applied to new tasks. Experiments show that this improves learning efficiency on the new task and increases the reward of the system.

This article thus presents two methods, a co-evolutionarily stable genetic algorithm and a policy-reuse action-selection strategy, both of which prove to improve the cooperative learning ability of multi-agent systems.
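The policy-reuse idea above can be sketched as a variant of tabular Q-learning in which exploration is biased, with a decaying probability, toward a policy saved from a previously solved task. The function and environment below are illustrative assumptions (the chain task, the parameter names `psi` and `nu`, and all rates are not from the paper); they only demonstrate the mechanism of mixing a stored policy into action selection.

```python
import random

def pi_reuse_q_learning(env_step, n_states, n_actions, past_policy,
                        episodes=200, psi=0.9, nu=0.95,
                        alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Q-learning whose exploration reuses a policy from a solved task:
    with probability psi follow the past policy, otherwise act
    epsilon-greedily on the current Q-table; psi decays by nu each step,
    so control gradually shifts to the newly learned policy."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, p = 0, psi
        for _ in range(50):                      # step cap per episode
            if past_policy is not None and rng.random() < p:
                a = past_policy[s]               # reuse the stored policy
            elif rng.random() < epsilon:
                a = rng.randrange(n_actions)     # random exploration
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = env_step(s, a)
            # Standard Q-learning update.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s, p = s2, p * nu
            if done:
                break
    return Q

def chain_step(s, a, n=6):
    # Toy new task: a chain of n states; action 1 moves right, 0 moves
    # left; reward 1 only on reaching the last state.
    s2 = min(n - 1, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == n - 1 else 0.0), s2 == n - 1

# Policy saved from a previously solved, similar task: always move right.
past = [1] * 6
Q = pi_reuse_q_learning(chain_step, 6, 2, past)
greedy = [max(range(2), key=lambda a: Q[s][a]) for s in range(6)]
```

Because the reused policy steers early episodes toward the rewarding region, the Q-table receives informative updates sooner than it would under purely random exploration, which is the source of the efficiency gain reported above.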