
Theoretical Research On Multi-step Reinforcement Learning Algorithm

Posted on: 2019-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: R Yang
Full Text: PDF
GTID: 2428330596967101
Subject: Applied Mathematics

Abstract/Summary:
Multi-step reinforcement learning is an important reinforcement learning method that unifies the characteristics of one-step reinforcement learning and Monte Carlo methods. Multi-step learning can accelerate the convergence of an algorithm through eligibility traces, so multi-step reinforcement learning has long been a hot topic in artificial intelligence and machine learning research.

Recently, a new algorithm called Q(σ) was proposed for value function estimation in reinforcement learning, where σ is the degree of sampling. Q(σ) is a method between full sampling and no sampling: its estimation equation is a convex combination of the estimation equations of the Sarsa and Expected Sarsa algorithms, and it thus unifies the characteristics of these two algorithms. Experimental results show that Q(σ) performs better than either of them. However, the original paper only tested the performance of Q(σ) empirically. In this thesis, we give a theoretical analysis of the convergence of Q(σ) and prove that Q(σ) converges under certain conditions.

The main content of this thesis is to summarize the common reinforcement learning algorithms (such as Dynamic Programming, Monte Carlo, Temporal Difference, and multi-step reinforcement learning algorithms) and to give a theoretical analysis of the convergence of these algorithms.
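The convex combination described above can be sketched as a one-step tabular update. The following is a minimal illustration, not the thesis's implementation; the function name, the tabular `Q` array, and the policy matrix `pi` are assumptions for the sketch. With σ = 1 the target reduces to the Sarsa sample backup, and with σ = 0 to the Expected Sarsa full backup.

```python
import numpy as np

def q_sigma_update(Q, s, a, r, s_next, a_next, pi, sigma, alpha=0.1, gamma=0.99):
    """One-step Q(sigma) update on a tabular Q array of shape (n_states, n_actions).

    The backup target interpolates between the Sarsa target (sigma = 1,
    full sampling) and the Expected Sarsa target (sigma = 0, no sampling).
    pi[s] holds the target policy's action probabilities in state s.
    """
    sarsa_backup = Q[s_next, a_next]                 # sampled next action
    expected_backup = np.dot(pi[s_next], Q[s_next])  # expectation under pi
    target = r + gamma * (sigma * sarsa_backup + (1.0 - sigma) * expected_backup)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

Setting σ strictly between 0 and 1 gives the intermediate algorithms whose empirical advantage motivates the convergence analysis in this thesis.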
Keywords/Search Tags: Reinforcement Learning, Value function estimation, Optimization, Temporal Difference