
Research On Agent Decision Issues Based On Markov Decision Theory

Posted on: 2013-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: J Guo
GTID: 2248330371981073
Subject: Control theory and control engineering
Abstract/Summary:
Markov Decision Theory (MDT) is an effective tool for studying the decision-making process of an agent. As its fundamental model, the Markov Decision Process (MDP) is often used to describe and solve large-scale decision-making problems involving uncertainty. To deal with situations where the agent can obtain only limited information from the environment, an extended version of the MDP, the Partially Observable Markov Decision Process (POMDP), was proposed in the literature. With the development of artificial intelligence, more and more researchers are focusing on decision-making in systems of multiple agents, known as Multi-Agent Systems (MAS). This research topic motivated the development of the Decentralized Partially Observable Markov Decision Process (DEC-POMDP).

This thesis aims to 1) introduce the three models contained in MDT mentioned above, namely MDP, POMDP and DEC-POMDP; 2) describe some algorithms for these models; 3) apply these models and their associated algorithms to the development of strategies for the players (agents) in RoboCup, based on the decision-making problems that arise in the RoboCup2D simulation competition. Specifically, the thesis consists of three major parts, as follows.

In the first part, we analyzed the shortcomings of the attack strategy in RoboCup2D simulation and identified a strategic deficiency when the players are in possession of the ball. After re-modeling the attack strategy for ball possession, we obtained the best strategy with an iterative method based on value function decomposition. Experimental results showed that our model and the associated algorithm significantly enhance the attack performance of our team.

The second part was devoted to improving the performance of the goalkeeper, whose decisions must be made in real time given only incomplete information. As remarked above, under imperfect information the POMDP is the most suitable model for the goalkeeper and may offer the best solution for blocking the opponents' attack in urgent situations. Moreover, the approach was designed so that the goalkeeper can still make its decision in real time. Experimental results suggested that our method greatly enhances the goalkeeper's performance.

In the third part, we focused on the decision-making problems involved in MAS. Although DEC-POMDP is designed to solve such problems, it cannot be applied directly to the problems in RoboCup2D simulation, since its corresponding algorithms can only solve small-scale problems. To overcome this limitation, we analyzed the DEC-POMDP model and related algorithms, and also performed a series of tests on several DEC-POMDP benchmark problems using the MADP Toolbox. We found that the bottleneck is the offline solution part of DEC-POMDP. We then proposed a grouping limited-space offline planning method to reduce the scale of the problem during the solving procedure. Experimental results on some benchmark problems in the MADP Toolbox showed that the running time was shortened compared with the original version.

The work in this thesis was conducted on the RoboCup2D platform. Several related models and algorithms were employed to tackle the decision-making problems of the players in the RoboCup2D simulation competition. Experimental results verified the effectiveness of this work. Based on the work in this thesis, our team, GDUT_TiJi, won first prize in the 2011 RoboCup China Open and has qualified for the 2012 RoboCup World Championship, to be held in Mexico in June 2012.
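
As a rough illustration of the MDP model on which the abstract builds, the following Python sketch runs standard value iteration on a tiny two-state attack scenario. It is not the value-function-decomposition method of the thesis. The states, actions, transition probabilities, rewards and the function name value_iteration are all hypothetical and chosen only for illustration.

```python
def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Standard value iteration for a finite MDP.

    P[a][s][t] is the probability of moving from state s to state t under
    action a; R[s][a] is the expected immediate reward. Returns the value
    function and a greedy policy (one action index per state).
    """
    n_states = len(R)
    n_actions = len(R[0])

    def q_value(s, a, V):
        # One-step lookahead: immediate reward plus discounted expected value.
        return R[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [max(q_value(s, a, V) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    policy = [max(range(n_actions), key=lambda a: q_value(s, a, V))
              for s in range(n_states)]
    return V, policy


# Hypothetical toy model (all numbers are made up for illustration).
# States:  0 = ball far from goal, 1 = ball near goal.
# Actions: 0 = dribble, 1 = shoot.
P = [
    # dribble: usually moves the ball closer, or keeps it close
    [[0.2, 0.8],
     [0.1, 0.9]],
    # shoot: play restarts far from goal afterwards
    [[1.0, 0.0],
     [1.0, 0.0]],
]
R = [
    [0.0, 0.05],  # far:  dribbling earns nothing, a long shot rarely scores
    [0.0, 0.5],   # near: a close shot scores more often
]

V, policy = value_iteration(P, R)
print("values:", V)
print("policy:", policy)
```

On this toy model the greedy policy is to dribble when far from the goal and to shoot when near it. A POMDP or DEC-POMDP replaces the known state with a belief or with per-agent observation histories, which is what makes those models so much harder to solve at scale.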
Keywords/Search Tags: RoboCup, decision-making, multi-agent system, MDP