Research And Design Of Soccer Robot Decision System Based On SARSA Algorithm

Posted on:2018-04-29

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Song

Full Text:PDF

GTID:2348330533969222

Subject:Computer Science and Technology

Abstract/Summary:

The RoboCup 2D soccer simulation is a platform for research on multi-agent robot systems, on which researchers can test different machine learning algorithms. Reinforcement learning is one of the most important families of machine learning algorithms: the agent interacts continuously with its environment in order to maximize its cumulative reward. Under certain conditions, reinforcement learning is guaranteed to converge to the optimal policy. It has been applied widely and successfully to chess, backgammon, Tetris and other games, but it has not been fully studied in the RoboCup 2D simulation. This thesis introduces the SARSA algorithm into the RoboCup 2D simulation and improves it. The state space of a player agent is mapped from the positions of the defensive players and the position of the ball. Together with the precondition functions that correspond to this mapping, which are the basis on which SARSA selects actions, this gives a design and implementation of SARSA on the Helios framework. Based on knowledge of the soccer field, the thesis proposes two reward shaping functions, a separation-based one and a transfer-distance-based one, which improve the team's performance. In a multi-agent system, a single agent often obtains only a sparse Q table through reinforcement learning, and such a table cannot represent the global situation of the whole system. To solve this problem, we build on multi-agent shared Q table methods and propose a Q integration algorithm, which gives the team a higher winning percentage. Because reinforcement learning often cannot guarantee that the Q table converges to the optimal policy, the thesis compares the convergence of an adaptive ε-greedy action selection strategy with that of a fixed ε-greedy strategy and, based on the experimental results, adopts the adaptive one. It then compares the effects of different reward functions on goal scoring and settles on a reward function. We also compare the behavior of SARSA under the two reward shaping functions and verify that both methods benefit the team. Finally, a number of matches are played against World Cup teams, and a statistical analysis of the results verifies the validity of the algorithm.
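
For readers unfamiliar with the terms used in the abstract, the following is a minimal tabular sketch of a SARSA update combined with ε-greedy action selection. It is not the thesis implementation. The function names, the learning rate `alpha`, the discount `gamma`, and the exponential decay schedule used to stand in for an "adaptive" ε are all assumptions made for illustration; the abstract does not specify how ε is adapted or how states and actions are encoded.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick a greedy action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Greedy choice: highest current Q estimate (first one wins on ties).
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, r, s_next, a_next,
                 alpha=0.1, gamma=0.9, terminal=False):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a)).

    a_next is the action actually chosen in s_next by the same policy,
    which is what distinguishes SARSA from Q-learning.
    """
    target = r if terminal else r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])
    return q[(s, a)]


def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.99):
    """Hypothetical adaptive schedule: exponential decay with a floor.

    Assumption for illustration only; the thesis may adapt epsilon
    differently.
    """
    return max(end, start * decay ** episode)


# Q table keyed by (state, action); unseen pairs default to 0.0.
q_table = defaultdict(float)
```

In a training loop, one would call `epsilon_greedy` to choose `a_next` in the new state before calling `sarsa_update`, and recompute ε with `decayed_epsilon` once per episode. A fixed ε-greedy strategy corresponds to passing the same `epsilon` value in every episode.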

Keywords/Search Tags:

reinforcement learning, SARSA algorithm, ε-greedy, reward shaping

Related items

1	Research And Application Of Deep Reinforcement Learning Algorithms Based On Reward Shaping
2	Research On Clustering Algorithm And Its Application Based On Reinforcement Learning
3	Theory and application of reward shaping in reinforcement learning
4	Research On Reward Optimization In Reinforcement Learning
5	Cooperation Mechanism Of Simulation 2D Soccer Robot Based On Reinforcement Learning
6	Research On Deep Reinforcement Learning Algorithm Based On The Combination Of Intrinsic Reward And Auxiliary Tasks
7	Research And System Implementation Of Path Planning Based On Deep Reinforcement Learning
8	Research On Value Function Approximation Methods In Reinforcement Learning
9	Research On Dynamic Coding Characteristics Of Reward Prediction Error And Brain Inspired Q-learning Algorithm
10	Application Of Reinforcement Learning Based On Robot Soccer Simulation