Reinforcement Learning Algorithm Study Based On ESN

Posted on:2022-08-04

Degree:Master

Type:Thesis

Country:China

Candidate:C Liu

Full Text:PDF

GTID:2518306488492504

Subject:Software engineering

Abstract/Summary:

In recent years, the field of artificial intelligence has advanced rapidly, and new technologies and methods have emerged one after another. Among them, deep reinforcement learning, which exploits the perceptual capability of neural networks, has attracted particular attention. This thesis takes the echo state network (ESN) as its research object and studies classic reinforcement learning algorithms based on ESN. The main work is as follows. First, the ESN online learning algorithm, which is traditionally optimized by recursive least squares (RLS), is improved, and a new mini-batch-based optimization algorithm, MRLS-ESN, is proposed. Then, the MRLS-ESN algorithm is combined with two traditional policy control algorithms, Q-learning and Sarsa, to yield two new policy control algorithms, ESNRLS-Q and ESNRLS-Sarsa. Finally, the application of the RLS-ESN optimization algorithm to the advantage actor-critic (A2C) algorithm is briefly discussed.

ESNs are generally optimized by RLS. Although RLS converges quickly, it uses only one sample per iteration, which makes ESNs difficult to scale to large datasets. To tackle this problem, an ESN model for mini-batch sequences is presented, together with two optimization algorithms for it based on stochastic gradient descent and Adam. A novel mini-batch RLS algorithm is then proposed to improve the training efficiency of the ESN model. On this basis, a regularization method is introduced to avoid overfitting during ESN training. In addition, to make ESNs more suitable for time-varying tasks, an adaptive method for the forgetting factor of the proposed algorithm is introduced. Simulation results show that the proposed algorithm has faster processing speed and better convergence quality than the original RLS algorithm.

ESNs are simple, easy to use, and efficient to train. However, because the states of an agent are strongly correlated, it is difficult for ESN-based policy control algorithms to update the network parameters with RLS. To solve this problem, two new policy control algorithms, ESNRLS-Q and ESNRLS-Sarsa, are proposed. First, a leaky integrator ESN and mini-batch training are used to reduce the correlation among training samples. Second, the RLS autocorrelation matrix is updated by an average approximation method so that mini-batch sequences can be processed. Third, regularization is applied to prevent overfitting. In addition, the Mellowmax method is used to calculate the target state-action values, which improves the convergence performance of the algorithms. Theoretical analysis and simulation experiments show that the proposed algorithms have both lower computational complexity and better convergence performance.

In the A2C algorithm, optimizing the critic network parameters is very important. To address this optimization problem, an A2C algorithm based on RLS-ESN is proposed. First, an ESN is used to provide more useful information for training the critic network. Second, the RLS algorithm is used to optimize the relevant parameters and accelerate convergence. Finally, a comparison with traditional gradient-based optimization algorithms verifies that the proposed algorithm is effective.
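
For orientation, the building blocks named in the abstract can be sketched in their textbook forms: a leaky integrator ESN state update, a single-sample RLS readout update with a forgetting factor, and the Mellowmax operator. This is a minimal sketch and not the thesis's MRLS-ESN, ESNRLS-Q, or ESNRLS-Sarsa algorithms; the mini-batch update, regularization, and adaptive forgetting factor are not reproduced here. All function names and hyperparameter values (`leak`, `forget`, `omega`, the spectral radius, the toy sine task) are assumptions chosen for illustration.

```python
import numpy as np


def make_reservoir(n_in, n_res, rng, spectral_radius=0.9):
    """Random input and recurrent weights, rescaled to a chosen spectral radius."""
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def esn_step(x, u, w_in, w, leak=0.3):
    """Leaky integrator update: blend the old state with the new activation."""
    return (1.0 - leak) * x + leak * np.tanh(w_in @ u + w @ x)


def rls_update(w_out, p, h, y, forget=0.999):
    """One standard RLS step for a linear readout y ~ w_out @ h.

    p is the (symmetric) inverse autocorrelation estimate,
    forget is the forgetting factor in (0, 1].
    """
    ph = p @ h
    k = ph / (forget + h @ ph)          # gain vector
    err = y - w_out @ h                 # a priori error
    w_out = w_out + np.outer(err, k)    # readout correction
    p = (p - np.outer(k, ph)) / forget  # inverse autocorrelation update
    return w_out, p, err


def mellowmax(q, omega=5.0):
    """Mellowmax: log-mean-exp of q scaled by omega, computed stably."""
    q = np.asarray(q, dtype=float)
    m = q.max()
    return m + np.log(np.mean(np.exp(omega * (q - m)))) / omega


if __name__ == "__main__":
    # Toy task (an assumption): predict the next sample of a sine wave online.
    rng = np.random.default_rng(0)
    w_in, w = make_reservoir(1, 50, rng)
    x = np.zeros(50)
    w_out = np.zeros((1, 50))
    p = np.eye(50) * 100.0  # large initial p means weak initial regularization
    signal = np.sin(0.2 * np.arange(501))
    for t in range(500):
        x = esn_step(x, signal[t:t + 1], w_in, w)
        w_out, p, err = rls_update(w_out, p, x, signal[t + 1:t + 2])
    print("last a priori error:", float(err[0]))
    print("mellowmax of [1, 2, 3]:", mellowmax([1.0, 2.0, 3.0]))
```

The single-sample `rls_update` above is the baseline the abstract describes as using one sample per iteration. In a Q-learning or Sarsa setting, `mellowmax` would replace the hard maximum over next-state action values when forming the target; as `omega` grows it approaches the maximum, and as `omega` approaches zero it approaches the mean.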

Keywords/Search Tags:

echo state networks, deep reinforcement learning, recursive least squares, mini-batch training, policy control algorithm, A2C algorithm

Related items

1	Research On Off-policy Reinforcement Learning Algorithm
2	Recursive Least-squares Reinforcement Learning Based On An Improved Extreme Learning Machine
3	Research On Least-Squares Policy Iteration Algorithms
4	Study Of Robot Arm Control Based On Deep Reinforcement Learning
5	Research On Dynamic Node Selection In Camera Networks
6	Research On Agent Decision-making And Control Based On Deep Reinforcement Learning
7	Robotic Intelligent Grasping Control Technology Based On Deep Reinforcement Learning
8	Research On Adaptive Filter Theory And Its Applications In Echo Cancellation
9	Study On Online Blind Equalization Algorithm For Satellite Channel Based On Echo State Network
10	Stochastic Algorithms In Deep Reinforcement Learning