
The Study Of Multi-Agent Reinforcement Learning Methods For Cooperative Team

Posted on: 2006-09-29 | Degree: Master | Type: Thesis
Country: China | Candidate: Y P Bao | Full Text: PDF
GTID: 2189360185463336 | Subject: Management Science and Engineering
Abstract/Summary:
Reinforcement learning (RL) has been a hot topic in research on multi-agent systems (MAS) and machine learning (ML), because it does not require a model of the environment: a reinforcement-learning agent learns through its interaction with the environment. MAS are often applied in open, complex, and dynamic environments, in which a single agent is insufficient to solve the task at hand, so agents must work cooperatively. To adapt to the environment's dynamic changes, agents must also have the capability to learn. However, traditional single-agent learning theory does not hold in the MAS case, so a new learning method suited to the cooperative character of MAS is needed.

In artificial intelligence, the Pursuit Game is often used to test learning algorithms. For this problem, the thesis establishes two multi-agent cooperative reinforcement learning (MACRL) methods: a Commitment-and-Conventions-based learning method (MACRL-CC) and a Joint-Action-Priority-Sequence-based learning method (MACRL-JAPS).

After introducing basic concepts of agents, MAS, and multi-agent learning, the thesis analyses the current state of research and future directions of RL and multi-agent RL (MARL); the underlying theory and related learning algorithms are briefly introduced. On the basis of an analysis of the Pursuit Game, and aimed at the independent action learner, the thesis extends the single-agent RL algorithm and proposes the MACRL-CC algorithm. Finally, aimed at the joint-action learner, a Team-Stochastic-Games-based (TSGs-based) framework for multi-agent cooperative RL is defined; to solve the multiple-equilibria problem in stochastic games, a MACRL algorithm called MACRL-JAPS is proposed. Both learning methods are validated by experiments. The main research achievements and innovations are the establishment of these two MACRL methods for the Pursuit Game.
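The single-agent RL algorithm that MACRL-CC builds on can be sketched as tabular Q-learning on a toy pursuit task. This is a minimal illustration, not the thesis' formulation: the grid size, prey position, rewards, and learning parameters are all invented assumptions.

```python
import random

# Illustrative sketch: one pursuer learns, by tabular Q-learning, to chase a
# stationary prey on a 1-D grid. MACRL-CC extends single-agent updates of
# this kind to the cooperative multi-agent setting.

GRID = 10            # positions 0..9 (assumption)
PREY = 7             # fixed prey position (assumption)
ACTIONS = (-1, 1)    # move left / move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

def step(pos, action):
    """Apply an action; reward 1 and terminate when the prey is caught."""
    nxt = max(0, min(GRID - 1, pos + action))
    return nxt, (1.0 if nxt == PREY else 0.0), nxt == PREY

def greedy(Q, pos, rng):
    """Pick a Q-maximising action, breaking ties randomly."""
    best = max(Q[(pos, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(pos, a)] == best])

def train(episodes=500, max_steps=200, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(GRID) for a in ACTIONS}
    for _ in range(episodes):
        pos = 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            a = rng.choice(ACTIONS) if rng.random() < EPS else greedy(Q, pos, rng)
            nxt, r, done = step(pos, a)
            # Standard Q-learning update toward the bootstrapped target.
            target = r + (0.0 if done else GAMMA * max(Q[(nxt, x)] for x in ACTIONS))
            Q[(pos, a)] += ALPHA * (target - Q[(pos, a)])
            pos = nxt
            if done:
                break
    return Q

Q = train()
```

After training, the greedy policy at position 0 should prefer moving right toward the prey, i.e. `Q[(0, 1)] > Q[(0, -1)]`.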
MACRL-CC analyses the special structure of the system's goal and decomposes it; by using the commitment-and-conventions-based method, the system achieves cooperative problem solving between agents. In view of the large state and action spaces, the notions of "generalization of state" and "generalization of action" are defined and used to reduce the state space. Since every agent gains similar experiences while learning, an experience-sharing method is also proposed to improve the learning efficiency of this method. MACRL-JAPS, in contrast, is proposed within the TSGs-based framework. To solve the multiple-equilibria problem in the games, the notion of JAPS is introduced; this method ensures that each agent can exactly predict the actions of the others and thus arrive at the same optimal equilibrium selection.
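The equilibrium-selection idea behind JAPS can be illustrated with a hypothetical sketch (not the thesis' code): when several joint actions are tied for the maximal team Q-value, a priority sequence over joint actions that is shared by all agents lets each agent break the tie locally and still select the same optimal equilibrium. The action names and Q-values below are invented for the example.

```python
from itertools import product

ACTIONS = ["up", "down", "left", "right"]

# Shared priority sequence over joint actions of two agents. Plain
# lexicographic order is used here; any fixed, commonly known ordering works.
PRIORITY = list(product(ACTIONS, repeat=2))

def select_equilibrium(q_joint):
    """Return the highest-priority joint action among those maximising Q.

    Every agent evaluates this same function on the same Q-values and the
    same priority sequence, so all agents agree on one joint action.
    """
    best = max(q_joint.values())
    for joint_action in PRIORITY:
        if q_joint.get(joint_action, float("-inf")) == best:
            return joint_action

# Two optimal equilibria are tied at Q = 1.0. Without a convention, agent 1
# might play "up" expecting ("up", "left") while agent 2 plays "right"
# expecting ("down", "right") -- a miscoordination. The shared sequence
# resolves the tie identically for both agents.
q = {ja: 0.0 for ja in PRIORITY}
q[("up", "left")] = 1.0
q[("down", "right")] = 1.0
print(select_equilibrium(q))  # -> ('up', 'left')
```

The design choice mirrors the "conventions" theme of the thesis: coordination is achieved not by communication at decision time but by a prior agreement that makes every agent's tie-breaking deterministic and identical.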
Keywords/Search Tags:Multi-Agent Systems, Reinforcement Learning, Pursuit Game, Commitment and Conventions, Cooperative Games, Team Stochastic Games, Nash Equilibrium, Joint Action Priority Sequence