Reinforcement Learning Research On RoboCup Soccer Keepaway

Posted on: 2007-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: J M Fan
GTID: 2178360182460582
Subject: Computer software and theory
Abstract/Summary:
RoboCup aims to provide a platform on which multiple agents coordinate and make decisions in real time. Because it is distributed, time-critical, dynamic and asynchronous, it has become a benchmark for research in distributed artificial intelligence. Soccer Keepaway is a subtask of RoboCup that serves as a benchmark for reinforcement learning: most reinforcement learning methods can be tested on it.

Reinforcement learning does not require a model of the environment. Instead, it acquires knowledge and improves its policy by interacting with the environment. It can handle noise, stochastic variation and delayed goals without knowing the system dynamics. Large state spaces can be handled with function approximation or state aggregation. It is also oriented towards making decisions relatively quickly rather than relying on extensive deliberation or meta-reasoning. For these reasons, reinforcement learning is being applied more and more in RoboCup simulation.

In this thesis, we analyze several reinforcement learning methods, including value-based reinforcement learning (VBRL), policy-gradient reinforcement learning and actor-critic reinforcement learning. Actor-critic methods are discussed in detail, and tile-coding linear function approximation is used to construct the features for the actor-critic learner. We then describe the advantages and disadvantages of actor-critic in Soccer Keepaway and compare it with value-based and policy-gradient reinforcement learning.

Finally, experiments compare the policies learned by reinforcement learning with benchmark policies. The results show that the learned policies outperform the benchmark policies. In addition, the policy learned by actor-critic is better than the one learned by Sarsa(λ), a value-based reinforcement learning method, provided that the players have a 360-degree view and the problem itself is not too large.
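The combination of tile coding and actor-critic mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a single scalar state variable, a softmax policy over linear action preferences, and a one-step update without eligibility traces. The names `tile_indices` and `actor_critic_step`, the step sizes and the tiling parameters are all hypothetical and chosen only for illustration.

```python
import math


def tile_indices(x, low, high, n_tilings=4, tiles_per_tiling=8):
    """Return one active feature index per tiling for a scalar x in [low, high].

    Each tiling is shifted by a fraction of a tile width, so it needs
    tiles_per_tiling + 1 tiles to cover the whole range.
    """
    width = (high - low) / tiles_per_tiling
    active = []
    for t in range(n_tilings):
        offset = t * width / n_tilings
        idx = int((x - low + offset) / width)
        idx = max(0, min(idx, tiles_per_tiling))  # clamp out-of-range inputs
        active.append(t * (tiles_per_tiling + 1) + idx)
    return active


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(v, theta, active, action, reward, next_active, done,
                      alpha_v=0.1, alpha_pi=0.05, gamma=1.0):
    """One-step actor-critic update with binary tile features (assumed setup).

    v     : critic weights, one per feature
    theta : actor weights, theta[a][i] for action a and feature i
    Returns the TD error used by both updates.
    """
    n = len(active)
    value = sum(v[i] for i in active)
    next_value = 0.0 if done else sum(v[i] for i in next_active)
    delta = reward + gamma * next_value - value  # TD error from the critic

    # Critic: move the state value towards the one-step target.
    for i in active:
        v[i] += alpha_v / n * delta

    # Actor: softmax policy-gradient step, scaled by the TD error.
    prefs = [sum(theta[a][i] for i in active) for a in range(len(theta))]
    probs = softmax(prefs)
    for a in range(len(theta)):
        grad = (1.0 if a == action else 0.0) - probs[a]
        for i in active:
            theta[a][i] += alpha_pi / n * delta * grad
    return delta
```

A real Keepaway learner would tile several state variables (distances and angles between players), and the Sarsa(λ) baseline named in the abstract additionally relies on eligibility traces, which this sketch leaves out to stay short.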
Keywords/Search Tags: Reinforcement Learning, MAS, Actor-Critic, RoboCup, Function Approximation