
Research And Implementation On Reinforcement Learning Algorithm Based On Gaussian Process

Posted on: 2017-06-20    Degree: Master    Type: Thesis
Country: China    Candidate: F L Wang    Full Text: PDF
GTID: 2428330566453452    Subject: Control Science and Engineering
Abstract/Summary:
Reinforcement learning (RL) is an important machine learning method, particularly in the area of artificial intelligence. RL methods fall into two classes: model-based and model-free algorithms. Model-free RL learns slowly and with limited accuracy. Model-based RL mitigates these problems, but it typically still requires many interactions with the system to learn a controller, which limits the development and application of RL. To address these limitations, we introduce an RL method based on Gaussian processes, a model-based policy-search algorithm. A Gaussian process (GP) models the external environment of the RL algorithm; we propose using moment matching combined with linearization of the posterior GP mean function to predict successor states and derive control strategies, and then use policy search to find the optimal policy, thereby effectively relieving the limitations of typical RL methods. The specific research work is as follows.

Firstly, Gaussian process regression is applied to build the external environment model of reinforcement learning. We analyze the disadvantages of model-free and model-based RL in detail, then study Gaussian process regression as a supervised learning model to understand its favorable properties, and exploit it to model the external environment. Because a GP quantifies its own uncertainty, this overcomes the difficulty that typical model-based RL algorithms can hardly tolerate model error, and it lays a foundation for computing the expected cost in RL.

Secondly, we study how to obtain the optimal strategy by minimizing the expected cost, thereby achieving the learning goal. Under the Gaussian environment model, we propose using moment matching combined with linearization of the posterior GP mean function to approximately predict the successor state and derive the control strategy, and thus obtain the expected cost. Given the expected cost function, we improve the policy along its gradient, optimizing the policy parameters with the conjugate gradient method or a quasi-Newton method. The resulting control strategy is applied to the system, and the system's feedback and the return of the strategy are used to update the external environment model; these steps are repeated until the optimal policy is learned and the learning goal is achieved. Predicting the mean and variance of the successor state by moment matching combined with linearization of the posterior GP mean function markedly improves the learning speed of typical reinforcement learning.

Finally, we apply the learning algorithm to a triple-link inverted pendulum model to evaluate its performance and carry out a correlation analysis.
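The GP environment model above can be sketched with a minimal Gaussian process regression, assuming a squared-exponential kernel with fixed hyperparameters (the thesis would learn these from data; the class and function names here are illustrative, not from the thesis):

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / lengthscale**2)

class GPDynamicsModel:
    """One-output GP regression: predicts a next-state component
    from (state, action) training pairs X with targets y."""
    def __init__(self, X, y, noise_var=1e-2):
        self.X = X
        K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        # alpha = (K + noise_var*I)^{-1} y via two triangular solves
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, Xs):
        """Posterior mean and (latent) variance at test inputs Xs."""
        Ks = rbf_kernel(Xs, self.X)
        mean = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = rbf_kernel(Xs, Xs).diagonal() - (v**2).sum(0)
        return mean, var
```

The predictive variance is what lets the policy search account for model error rather than trusting a single deterministic model.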
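The prediction step combines moment matching with linearization of the posterior GP mean to propagate a Gaussian state distribution through the model. A minimal sketch of the linearization half, assuming a generic differentiable mean function `f` and first-order (EKF-style) propagation with a numerical Jacobian (function name and tolerances are assumptions for illustration):

```python
import numpy as np

def propagate_linearized(f, m, S, eps=1e-5):
    """Approximate the distribution of f(x) for x ~ N(m, S) by
    linearizing f at the input mean m (first-order propagation)."""
    m = np.asarray(m, float)
    fm = np.asarray(f(m), float)
    # Central-difference Jacobian of f at m
    J = np.zeros((fm.size, m.size))
    for i in range(m.size):
        d = np.zeros_like(m)
        d[i] = eps
        J[:, i] = (np.asarray(f(m + d)) - np.asarray(f(m - d))) / (2 * eps)
    # Mean maps through f; covariance maps through the Jacobian
    return fm, J @ S @ J.T
```

For a linear map the propagation is exact, which is a quick sanity check on the implementation; for a GP posterior mean it gives the approximate successor-state distribution used when computing the expected cost.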
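The outer policy-search loop can be illustrated on a hypothetical scalar linear system with a quadratic cost and a linear state-feedback policy; here plain gradient descent with backtracking stands in for the conjugate-gradient or quasi-Newton update named in the thesis, and every constant is an assumption for illustration only:

```python
# Hypothetical toy setup: scalar system x_{t+1} = a*x_t + b*u_t with
# linear policy u = k*x and quadratic stage cost x^2 + r*u^2.
a, b, r, x0, H = 1.2, 0.5, 0.1, 1.0, 30

def expected_cost(k):
    """Finite-horizon cost of the policy u = k*x rolled out on the model."""
    x, cost = x0, 0.0
    for _ in range(H):
        u = k * x
        cost += x * x + r * u * u
        x = a * x + b * u
    return cost

def num_grad(f, k, eps=1e-6):
    """Central-difference gradient of the expected cost."""
    return (f(k + eps) - f(k - eps)) / (2 * eps)

# Gradient descent with backtracking (stand-in for CG / quasi-Newton).
k = 0.0
for _ in range(200):
    g = num_grad(expected_cost, k)
    step = 0.1
    while step > 1e-12 and expected_cost(k - step * g) > expected_cost(k):
        step *= 0.5  # backtrack until the cost decreases
    k -= step * g
```

The learned gain stabilizes the open-loop-unstable system (|a + b*k| < 1) and drives the expected cost far below its initial value, mirroring how the thesis improves the policy along the gradient of the expected cost at each iteration.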
Keywords/Search Tags:Reinforcement learning, Gaussian process regression, Expected cost, Policy search, Triple-link inverted pendulum