
Theoretical Research On Multi-step Reinforcement Learning Algorithm

Posted on: 2019-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: R Yang
Full Text: PDF
GTID: 2428330596967101
Subject: Applied Mathematics

Abstract/Summary:
Multi-step reinforcement learning is an important reinforcement learning method that unifies the characteristics of one-step reinforcement learning and Monte Carlo methods. Multi-step learning can accelerate the convergence of an algorithm through eligibility traces, so multi-step reinforcement learning has long been a hot topic in artificial intelligence and machine learning research.

Recently, a new algorithm called Q(σ) was proposed for value function estimation in reinforcement learning, where σ is the degree of sampling. Q(σ) is a method between full sampling and no sampling: its estimation equation is a convex combination of the estimation equations of the Sarsa and Expected Sarsa algorithms, and it thus unifies the characteristics of these two algorithms. Experimental results show that Q(σ) performs better than either of them. However, the original paper only tested the performance of Q(σ) empirically. In this thesis, we give a theoretical analysis of the convergence of Q(σ) and prove that Q(σ) converges under certain conditions.

The main content of this thesis is to summarize the common reinforcement learning algorithms (such as Dynamic Programming, Monte Carlo, Temporal Difference, and multi-step reinforcement learning algorithms) and to give a theoretical analysis of the convergence of these algorithms.
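The convex combination described above can be sketched as a one-step tabular update. The following is a minimal illustration, not the thesis's implementation; the function name, the tabular `Q` array, and the policy matrix `pi` are assumptions for the sketch. With σ = 1 the target reduces to the Sarsa sample backup, and with σ = 0 to the Expected Sarsa full backup.

```python
import numpy as np

def q_sigma_update(Q, s, a, r, s_next, a_next, pi, sigma, alpha=0.1, gamma=0.99):
    """One-step Q(sigma) update on a tabular Q array of shape (n_states, n_actions).

    The backup target interpolates between the Sarsa target (sigma = 1,
    full sampling) and the Expected Sarsa target (sigma = 0, no sampling).
    pi[s] holds the target policy's action probabilities in state s.
    """
    sarsa_backup = Q[s_next, a_next]                 # sampled next action
    expected_backup = np.dot(pi[s_next], Q[s_next])  # expectation under pi
    target = r + gamma * (sigma * sarsa_backup + (1.0 - sigma) * expected_backup)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

Setting σ strictly between 0 and 1 gives the intermediate algorithms whose empirical advantage motivates the convergence analysis in this thesis.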
Keywords/Search Tags: Reinforcement Learning, Value function estimation, Optimization, Temporal Difference