Research On AGV Path Planning Based On Cooperative Multi-agent Reinforcement Learning

Posted on:2024-05-17

Degree:Master

Type:Thesis

Country:China

Candidate:D Y Liao

Full Text:PDF

GTID:2568307148962679

Subject:Electronic information

Abstract/Summary:

As industrial automation advances, Automated Guided Vehicles (AGVs) have become key equipment in the logistics industry, where they are widely used to handle materials and products. Well-planned AGV routes can substantially improve the efficiency of material handling. Because reinforcement learning learns autonomously, it can improve the efficiency of AGVs in complex dynamic environments. This thesis therefore proposes an AGV path planning method based on multi-agent reinforcement learning. The main work is as follows.

First, the value decomposition algorithms QMIX and QTRAN cannot balance training speed and stability. To address this, the thesis proposes a multi-agent deep reinforcement learning algorithm called QTRAN Plus. The algorithm improves on QTRAN by replacing the sum of the agents’ Q-value networks with a mixing network, which enhances the network’s approximation and optimization capability. A new loss function is proposed for training the mixing network and all agents’ Q-value networks to improve convergence speed. Simulation and ablation experiments show that QTRAN Plus outperforms other algorithms in robot cooperative handling tasks.

Second, in traditional tabular Q-learning, finding the action with the maximum Q-value requires traversing and comparing the Q-value of every action in the Q-table, which is computationally expensive. To address this, the thesis proposes T2Q, a multi-agent reinforcement learning algorithm based on an improved tabular Q-learning. T2Q employs a centralized training, decentralized execution framework and reduces computational complexity by storing the two highest Q-values for each state, which replaces most of the traversal. Theoretical analysis proves that T2Q is superior to traditional tabular Q-learning in computational complexity and convergence speed. Simulation experiments show that T2Q achieves a 100% success rate in converging to the optimal joint policy on both platforms.

Finally, a multi-AGV warehouse simulation platform is designed and developed to verify the effectiveness of the proposed reinforcement learning methods on multi-AGV path planning problems. Simulation results show that, compared with the QMIX and VDN algorithms, the proposed T2Q and QTRAN Plus algorithms converge to the optimal policy more quickly. In addition, the learned policies are visualized with the logistics simulation software Flexsim, providing intuitive validation of the algorithms’ optimality.
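The idea of storing the two highest Q-values for each state can be illustrated with a minimal Python sketch. This is not the thesis's T2Q algorithm, whose details the abstract does not give; the class `TopTwoQTable`, its methods, and the fallback rescan are all hypothetical choices made for illustration. The sketch keeps a per-state cache of the best and second-best actions, so the maximum Q-value is read in constant time, and a full scan of the row is needed only when a cached value decreases past its neighbor.

```python
class TopTwoQTable:
    """Tabular Q-values with a per-state cache of the two best actions.

    Hypothetical helper for illustration only; not taken from the thesis.
    """

    def __init__(self, n_states, n_actions, init=0.0):
        if n_actions < 2:
            raise ValueError("need at least two actions")
        self.q = [[init] * n_actions for _ in range(n_states)]
        # top[s] = [best_action, second_best_action]
        self.top = [[0, 1] for _ in range(n_states)]

    def best_action(self, s):
        return self.top[s][0]

    def max_q(self, s):
        # Constant-time lookup instead of scanning the whole row.
        return self.q[s][self.top[s][0]]

    def _rescan(self, s):
        # Fallback: linear scan to rebuild the cache for state s.
        row = self.q[s]
        best, second = (0, 1) if row[0] >= row[1] else (1, 0)
        for a in range(2, len(row)):
            if row[a] > row[best]:
                best, second = a, best
            elif row[a] > row[second]:
                second = a
        self.top[s] = [best, second]

    def set(self, s, a, value):
        row = self.q[s]
        best, second = self.top[s]
        old = row[a]
        row[a] = value
        if a == best:
            if value < row[second]:
                self._rescan(s)  # best fell below the runner-up
        elif a == second:
            if value > row[best]:
                self.top[s] = [a, best]  # runner-up overtakes the best
            elif value < old:
                self._rescan(s)  # a third action may now be the runner-up
        elif value > row[best]:
            self.top[s] = [a, best]
        elif value > row[second]:
            self.top[s][1] = a

    def update(self, s, a, reward, s_next, alpha=0.1, gamma=0.9):
        # Standard Q-learning update, using the cached maximum of s_next.
        target = reward + gamma * self.max_q(s_next)
        self.set(s, a, self.q[s][a] + alpha * (target - self.q[s][a]))
```

In this sketch the cache pays off when updates mostly raise Q-values or touch actions outside the top two, since those cases never trigger the rescan.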

Keywords/Search Tags:

automated guided vehicle, path planning, multi-agent reinforcement learning, multi-agent deep reinforcement learning

Related items

1	Research On Multi-Agent Path Planning Based On Deep Reinforcement Learning
2	Research On Multi-Agent Path Planning Based On Deep Reinforcement Learning
3	Research On Deep Reinforcement Learning Technology For Multi-agent Collaboration
4	Multi-agent Collaborative Path Planning Based On Deep Reinforcement Learning
5	Local Attention-Cooperated Reinforcement Learning For Multi-agent Path Finding
6	AGV Road Network Design Based On Multi-agent Reinforcement Learning
7	Research And Application Of Agents Obstacle Avoidance And Path Planning Based On Deep Reinforcement Learning
8	Research On Intelligent Path Planning Of Manipulator Based On Reinforcement Learning
9	Research On Multi-agent System Decision Algorithm Based On Deep Reinforcement Learning
10	Research On Key Technologies Of Multi-agent Cooperation Problems Based On Reinforcement Learning