Research On AGV Storage Path Planning Based On Reinforcement Learning

Posted on:2022-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:H Liu

Full Text:PDF

GTID:2518306566990709

Subject:Control Engineering

Abstract/Summary:

With the rapid development of flexible manufacturing systems and intelligent warehousing systems, demand for Automated Guided Vehicles (AGVs) is increasing. An AGV is an automatic navigation device equipped with electromagnetic or optical sensors that can carry cargo and travel along a prescribed navigation path. To improve the efficiency of AGV operations and reduce transportation costs, this thesis proposes an AGV path planning method based on reinforcement learning for the storage environment.

For single-AGV path planning, a reinforcement learning-ant colony algorithm is proposed. To address the shortcomings of the ant colony algorithm in some situations, ideas from reinforcement learning are incorporated into it. The main contributions are as follows. First, the Q value is added to the state transition formula of the ant colony algorithm to raise the probability of selecting nodes on the optimal path. Second, poor nodes are penalized so that ants avoid poor paths. Third, the path length and cumulative reward of each generation of ants are combined to determine a comprehensive locally optimal path, and the pheromone on that path is reinforced to improve both the accuracy of the optimal path and the ability to explore. The effectiveness of the proposed method is verified by comparing its simulation results with those of the ant colony algorithm.

For multi-AGV path planning, multi-agent reinforcement learning (MARL) is used to optimize multi-AGV collaboration, and a cooperative task-oriented MARL algorithm, WRFMR (Weighted Relative Frequency of obtaining the Maximal Reward), is proposed. The main contributions are as follows. First, WRFMR requires each agent to estimate the Q function of its own actions only, not the Q function of joint actions, which alleviates the exponential growth of the joint action space. Second, the algorithm uses weight parameters and action probabilities to balance exploration and exploitation, so as to accelerate convergence to the optimal joint action. An iterative method is used to estimate the frequency of obtaining the maximum reward, which reduces the space complexity of the algorithm.

A mathematical model of the learning process of WRFMR in cooperative repeated games is established, and the dynamics of the model are studied. The following conclusion is obtained: if the constituent actions of each optimal joint action are unique, then each optimal joint action is an asymptotically stable equilibrium point. WRFMR is compared with other MARL algorithms on two multi-agent cooperative tasks, the box pushing task and the DSN task, which verifies its effectiveness. Finally, this thesis builds a multi-AGV storage simulation environment, in which WRFMR is again compared with other MARL algorithms. The results show that WRFMR performs well in the multi-AGV storage system. In addition, Flexsim is used to visualize the learned strategy, further verifying the effectiveness of the WRFMR algorithm.
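
The first single-AGV contribution, adding a Q value to the ant colony state transition rule, can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formula, so everything below is an assumption for illustration: the standard pheromone and heuristic terms are multiplied by an exponential of the Q value (chosen here only so that negative Q values still give positive weights), and the names `transition_probabilities`, `alpha`, `beta`, and `gamma` are hypothetical.

```python
import math


def transition_probabilities(tau, eta, q, alpha=1.0, beta=2.0, gamma=1.0):
    """Selection probabilities for an ant's candidate next nodes.

    tau   -- pheromone level on the edge to each candidate node
    eta   -- heuristic desirability of each candidate (e.g. 1 / distance)
    q     -- learned Q value of moving to each candidate
    alpha, beta, gamma -- hypothetical exponents weighting the three terms

    Assumed form (not taken from the thesis):
        weight_j = tau_j**alpha * eta_j**beta * exp(gamma * (q_j - max(q)))
    """
    if not (len(tau) == len(eta) == len(q)) or not tau:
        raise ValueError("tau, eta and q must be non-empty and equally long")

    # Subtracting max(q) keeps exp() numerically stable without changing
    # the normalized probabilities.
    q_max = max(q)
    weights = [
        (t ** alpha) * (e ** beta) * math.exp(gamma * (qv - q_max))
        for t, e, qv in zip(tau, eta, q)
    ]

    total = sum(weights)
    if total == 0.0:
        # No pheromone or heuristic signal at all: fall back to uniform choice.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]
```

With equal pheromone and heuristic values, the candidate with the higher Q value receives the larger probability, which is the effect the abstract describes. An ant could then sample its next node with `random.choices(candidates, weights=probabilities)`.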

Keywords/Search Tags:

reinforcement learning, multi-agent reinforcement learning, path planning, ant colony algorithm

Related items

1	Research On Intelligent Path Planning Of Manipulator Based On Reinforcement Learning
2	Research On Multi-AGVs Path Planning And Scheduling Technology Based On Reinforcement Learning
3	Decentralized Multi-agent Reinforcement Learning Algorithm Research
4	Research And Application Of Agents Obstacle Avoidance And Path Planning Based On Deep Reinforcement Learning
5	Multi Agent Path Planning And Formation Based On Hierarchical Reinforcement Learning
6	Supervised Reinforcement Learning: Methods And Applications
7	Research On Deep Reinforcement Learning Technology For Multi-agent Collaboration
8	Study On Emergency Escape Route Planning Based On Reinforcement Learning
9	Path Planning For Mobile Platforms In Known Environments Based On Deep Reinforcement Learning
10	Research On Path Planning Based On Multi-Agent Cooperative Communication Learning