
Research On Laser Navigation AGV Control Method Based On Reinforcement Learning

Posted on: 2021-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: E M Du
Full Text: PDF
GTID: 2428330605482453
Subject: Computer technology
Abstract/Summary:
The Automatic Guided Vehicle (AGV), a flexible piece of automated handling equipment, plays an important role in the development of intelligent logistics and Industry 4.0. Path following is the core technology by which an AGV achieves high-precision control, and it is also the main difficulty in promoting and applying AGVs in many industrial fields. To address the modeling difficulties caused by unknown parameters and to avoid a large amount of manual testing, this thesis designs a path following control algorithm based on deep reinforcement learning for the laser-guided AGV (LAGV). It also introduces meta-learning so that the control algorithm discovers regularities in learning and can achieve path following quickly when faced with LAGVs with different parameters and with different paths. The main research work is as follows:

(1) Design of a path following control algorithm based on improved deep reinforcement learning. After the LAGV path following system is constructed, the problem is formulated as a Markov decision process (MDP). The actor-critic framework is used to handle the continuous state space and action space of the path following MDP model, and proximal policy optimization (PPO), which uses importance sampling and explores well, serves as the policy gradient method of the control algorithm. Finally, finite-step advantage estimation and a Gaussian action output are combined to implement the LAGV path following control algorithm. Experimental results show that, compared with PID and other reinforcement learning algorithms, the proposed control algorithm achieves more accurate and stable path tracking.

(2) Path following control algorithm based on meta-reinforcement learning. Building on the actor-critic framework, a meta-state critic (Meta-SC) is designed as the meta-learning network. It is composed of the critic network and a Task Actor Encoder Network (TAEN) with memory. The characteristics of the path following task are stored in both the TAEN and the critic network through synchronous updates, which enables the critic to estimate the state value function at the task level, so that Meta-SC can establish a core value network for LAGV path following. When an LAGV with large parameter fluctuations performs path following, the previously learned core value experience guides the corresponding actor network to learn quickly from a small number of samples, which accelerates training. The experimental results show that the Meta-SC meta-learning algorithm improves training efficiency more than traditional reinforcement learning and Model-Agnostic Meta-Learning (MAML).
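
The ingredients named in item (1), a Gaussian action output, a finite-step advantage estimate, and an importance-sampled proximal policy update, can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the standard clipped surrogate objective of proximal policy optimization and a scalar action, and the function names and the values of `gamma`, `n`, and `clip_eps` are hypothetical choices made only for illustration.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a scalar action under a Gaussian policy N(mean, std^2)."""
    var = std * std
    return -0.5 * ((action - mean) ** 2 / var + math.log(2.0 * math.pi * var))


def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99, n=3):
    """Finite-step advantage estimate for one trajectory of length T.

    A_t = sum_{k<m} gamma^k * r_{t+k} + gamma^m * V(s_{t+m}) - V(s_t),
    with m = min(n, T - t) and V(s_T) = bootstrap_value.
    """
    horizon = len(rewards)
    ext_values = list(values) + [bootstrap_value]
    advantages = []
    for t in range(horizon):
        m = min(n, horizon - t)
        ret = sum(gamma ** k * rewards[t + k] for k in range(m))
        ret += gamma ** m * ext_values[t + m]
        advantages.append(ret - ext_values[t])
    return advantages


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to be maximized)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Importance sampling ratio between the new and the old policy.
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Illustrative usage with made-up numbers (not data from the thesis).
actions = [0.10, -0.05, 0.00]
old_lp = [gaussian_log_prob(a, 0.0, 0.5) for a in actions]
new_lp = [gaussian_log_prob(a, 0.02, 0.5) for a in actions]
adv = n_step_advantages([1.0, 0.5, 0.2], [0.8, 0.6, 0.3], bootstrap_value=0.0)
objective = ppo_clip_objective(new_lp, old_lp, adv)
```

The clipping keeps the importance sampling ratio close to 1, so a batch collected under the old policy can be reused for several updates without the new policy moving too far from it. In practice the means and standard deviations would come from the actor network and the values from the critic network.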
Keywords/Search Tags: Automatic Guided Vehicle, Path Following, Deep Reinforcement Learning, Meta Learning