Research On Reinforcement Learning Based On Gaussian Process Regression

Posted on:2015-03-07

Degree:Master

Type:Thesis

Country:China

Candidate:C Zhuang

GTID:2268330428998522

Subject:Software engineering

Abstract/Summary:

Reinforcement learning is an important class of machine learning methods, particularly in the area of artificial intelligence. In the framework of reinforcement learning, the agent learns how to map situations to actions through long-term interaction with the environment, so as to maximize the cumulative reward. Reinforcement learning methods have been widely applied in fields such as games, elevator dispatching, and robotics.

To address the "curse of dimensionality" that arises in large discrete state spaces and in continuous spaces, this thesis proposes algorithms based on value function approximation and compares the convergence performance of different algorithms. The main contributions are as follows:

(1) General value function approximation methods usually require the assumption of a specific functional form. This thesis instead uses Gaussian process regression, which does not require the function to follow any specific model. It represents the function indirectly, but rigorously, through the data themselves. This method is easy to implement, adapts its parameters automatically, and has a sound theoretical foundation. Gaussian process regression is also a form of supervised learning.

(2) In the Dyna framework, the convergence rate of the algorithm decreases as the size of the discrete state space increases. In response to this problem, an optimized algorithm based on Gaussian process regression and state clustering is proposed. First, the scale of the discrete state space is reduced by clustering. Second, Gaussian process regression is used to estimate the state values of the large-scale discrete state space. Experiments show that the algorithm effectively improves the convergence speed.

(3) A value iteration algorithm for continuous spaces based on Gaussian process regression, named GPRV, is proposed. In a continuous space, the value function cannot be stored in a table as it can in a discrete space. Value iteration is therefore combined with a Gaussian process regression model to implement the function approximation. The GPR-based algorithm estimates state values effectively and shows better convergence in the pole-balancing experiment.
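The idea in item (3), value iteration with a Gaussian process regressor in place of a value table, can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the thesis's GPRV algorithm: the one-dimensional state space [0, 1], the three actions, the reward, the squared-exponential kernel, and all function names and hyperparameters are hypothetical choices made for this example. The support grid is deliberately chosen so that successor states fall on support points, which keeps the toy iteration well behaved; the fitted GP mean then supplies value estimates at states that were never visited.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.1, signal_var=1.0):
    """Squared-exponential kernel between two 1-D arrays of states."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_fit(x_train, y_train, noise_var=1e-6):
    """Return the weights alpha = (K + noise_var * I)^{-1} y of the GP mean."""
    gram = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    chol = np.linalg.cholesky(gram)
    return np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))


def gp_predict(x_train, alpha, x_query):
    """GP posterior mean at the query states."""
    return rbf_kernel(x_query, x_train) @ alpha


def step(state, action):
    """Hypothetical toy dynamics: move along [0, 1]; the goal is s = 0.5."""
    reward = -np.abs(state - 0.5)            # cost grows with distance to goal
    next_state = np.clip(state + action, 0.0, 1.0)
    return next_state, reward


def gp_value_iteration(n_support=11, gamma=0.9, n_iters=60):
    """Value iteration where V is represented by a GP over support states."""
    support = np.linspace(0.0, 1.0, n_support)
    actions = np.array([-0.1, 0.0, 0.1])
    values = np.zeros(n_support)
    for _ in range(n_iters):
        alpha = gp_fit(support, values)      # regress current value estimates
        backups = []
        for action in actions:
            next_state, reward = step(support, action)
            backups.append(reward + gamma * gp_predict(support, alpha, next_state))
        values = np.max(np.stack(backups), axis=0)   # Bellman optimality backup
    alpha = gp_fit(support, values)          # final model for arbitrary queries
    return support, values, alpha


if __name__ == "__main__":
    support, values, alpha = gp_value_iteration()
    # Query the learned value function at a state outside the support set.
    print(gp_predict(support, alpha, np.array([0.45])))
```

No functional form is assumed for the value function here; the kernel and the support data determine it, which is the property item (1) attributes to Gaussian process regression. In a realistic continuous problem the successor states would not coincide with support points, and convergence of the combined iteration would have to be checked rather than assumed.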

Keywords/Search Tags:

Reinforcement Learning, function approximation, Gaussian process regression, Dyna framework, GPRV

Related items

1	Research On Nonparametric Value Function Approximation Reinforcement Learning
2	Research On Value Function Approximation Methods In Reinforcement Learning
3	Researches On Reinforcement Learning Algorithm Based On Nonparametric Approximation
4	Research On Path Planning Of Mobile Robot Based On Reinforcement Learning
5	Research And Implementation On Reinforcement Learning Algorithm Based On Gaussian Process
6	Research On Efficient Approximation Methods For Gaussian Process Regression
7	Study Of Reinforcement Learning Algorithms Based On Value Function Approximation
8	Research On Reinforcement Learning In Continuous Spaces
9	Research On Data-efficient Reinforcement Learning Based On Local Gaussian Process Regression
10	Research On Reinforcement Learning Methods Based On Fuzzy Approximation