Coordination and cooperation are among the most active topics in multi-robot systems research. The multi-robot pursuit game is an ideal platform for studying them: it investigates optimal cooperative pursuit algorithms in the dynamic process in which multiple predators capture multiple evaders. The research draws on many disciplines and areas of domain knowledge, including real-time vision processing, wireless communication, real-time path planning, distributed multi-robot coordination and control, planning and learning, and competition and cooperation within and among robot groups.

Reinforcement learning is a method for learning a mapping from states to actions that maximizes a numerical reward signal. Applied to the multi-robot pursuit game, it lets predators actively explore the environment, acquire knowledge by interacting with it, and continually improve the system's performance. By accumulating experience through learning, robots can recognize the gap between their own capabilities and those of their targets, and then adjust their actions to improve pursuit efficiency.

Motivated by the study of cooperation mechanisms in the setting of the multi-robot pursuit game, and aiming to improve the cooperation efficiency of robot collectives, this dissertation studies cooperation and coordination algorithms for boundedly rational agents in dynamic multi-agent systems. The main contributions are as follows.

Firstly, we propose applying multi-agent reinforcement learning to the multi-robot pursuit game. The formation of cooperative coalitions is analyzed according to the requirements of the pursuit task, and association rule mining is introduced to perform task assignment. By comparing the attributes of tasks with the ability attributes of agents, concrete pursuit teams can be built.
Since rewards differ across states, a segmented reinforcement learning method is presented to explore optimal cooperative pursuit strategies. The complexity of reinforcement learning grows exponentially with the number of agents. To avoid this so-called "curse of dimensionality", an approach that reduces the scale of the multi-agent system is put forward. Two multi-agent reinforcement learning approaches, based on task planning and on case-based reasoning respectively, are given; they form the theoretical foundation for learning optimal pursuit strategies in unknown environments.

Secondly, the optimal cooperative capture problem in a known environment is investigated. This dissertation extends the master-slave cooperation mechanism. Partitioning the pursuit area reduces the load on the system. A preference function is then used to select the members of each pursuit team. The evader's position at the next time step is predicted, which helps the predators choose their actions. To address the shortcomings of this approach, an improved multi-robot pursuit algorithm is given. A data set of attribute relationships is built from all the factors involved in capturing evaders, and pursuit teams are formed using the Apriori algorithm. Finally, because action rewards differ among robots, a segmented reinforcement learning method is proposed to learn the optimal cooperative pursuit strategy.

Thirdly, the optimal cooperative capture problem in an unknown environment is investigated. This dissertation first applies the circle searching method to search for targets. Once the targets are found, integral planning is used to build cooperative teams, based on the theory of task decomposition and allocation. Each cooperative team learns independently: its members take best-response actions to the actions of the other agents in the same state, and after many repeated games the target route can be found.
Because the agents influence one another, the learning process is supervised periodically, and the learning rate is adjusted to obtain correct learning results. However, the results of task decomposition and allocation are coarse, integral planning is a complex problem, and the complementarity of abilities among predators is not considered. A multi-robot cooperative pursuit algorithm using reinforcement learning based on case-based reasoning is therefore put forward. When predators pursue evaders, they refer to historical information and the current states of all robots to decide their actions. Experimental results show that the proposed algorithm markedly improves the task completion ratio in complex environments.

Fourthly, a simulation system for the multi-robot cooperative pursuit problem is developed. It provides an experimental platform for in-depth research on the multi-robot pursuit problem, and its modular structure makes it convenient to test new algorithms. The approaches proposed in this dissertation are also verified by experiments.
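
To make the notion of learning a state-to-action mapping from a numerical reward concrete, the following is a minimal illustrative sketch of tabular Q-learning on a toy pursuit task. It is not the algorithm developed in this dissertation: the single pursuer, the stationary evader, the one-dimensional line of cells, the reward values (+10 on capture, -1 per step), and all names such as `train_pursuer` and `greedy_action` are assumptions chosen only for illustration.

```python
import random

ACTIONS = (-1, +1)  # hypothetical action set: step left or step right


def train_pursuer(n_cells=7, episodes=5000, alpha=0.5, gamma=0.9,
                  epsilon=0.2, max_steps=50, seed=0):
    """Tabular Q-learning for one pursuer chasing a stationary evader."""
    rng = random.Random(seed)
    # Q-table indexed by state (pursuer_pos, evader_pos), one value per action.
    q = {(p, e): [0.0, 0.0] for p in range(n_cells) for e in range(n_cells)}

    for _episode in range(episodes):
        evader = rng.randrange(n_cells)
        pursuer = rng.randrange(n_cells)
        for _step in range(max_steps):
            if pursuer == evader:  # evader already captured
                break
            state = (pursuer, evader)
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = q[state].index(max(q[state]))
            # Move, staying inside the line of cells.
            nxt = min(max(pursuer + ACTIONS[a], 0), n_cells - 1)
            captured = nxt == evader
            reward = 10.0 if captured else -1.0  # assumed reward scheme
            future = 0.0 if captured else max(q[(nxt, evader)])
            # Standard Q-learning update.
            q[state][a] += alpha * (reward + gamma * future - q[state][a])
            pursuer = nxt
    return q


def greedy_action(q, pursuer, evader):
    """Return the learned greedy step (-1 or +1) for the given positions."""
    values = q[(pursuer, evader)]
    return ACTIONS[values.index(max(values))]
```

After training, `greedy_action(q, pursuer, evader)` returns the step the learned policy prefers; in this toy setting it should move the pursuer toward the evader. With several predators the joint state and action spaces grow exponentially, which is the scaling problem the abstract refers to.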