
Multi-Robot Collaborative Navigation Based On Deep Reinforcement Learning

Posted on: 2020-04-19  Degree: Master  Type: Thesis
Country: China  Candidate: S Z Zhou  Full Text: PDF
GTID: 2428330572983004  Subject: Control Science and Engineering
Abstract/Summary:
Faced with tasks involving complex scenes, bad weather, and unknown terrain, intelligent robots can replace humans in high-risk missions such as drone reconnaissance, unmanned-boat escort, and search and rescue. In these tasks, a multi-robot system offers higher fault tolerance, better adaptability, and more efficient use of resources than a single-robot system, and cooperation at the system level compensates for the limitations of individual robots. Collaborative formation navigation is the core capability required to complete such complex tasks. Because multiple robots move simultaneously, the structured environment is no longer static, and traditional formation algorithms, which are mostly model-based, perform poorly in complex environments. To cope with the real-time, dynamic, and stochastic nature of multi-robot systems, each robot must acquire information about the external environment through self-learning, accumulate experience, and continually improve its own policy.

This thesis studies cooperative navigation of multiple robots using deep reinforcement learning. Using raw sensor data and no map information, the robots coordinate their tasks in complex environments without inter-robot communication while avoiding collisions to ensure their own safety. Starting from the Deep Deterministic Policy Gradient (DDPG) algorithm, the thesis makes targeted improvements for the characteristics of robot navigation tasks and multi-robot systems. The main contents and contributions are as follows:

1. The research platform is an unmanned ground vehicle equipped with a two-dimensional single-line laser scanner. A prioritized experience replay mechanism is added to DDPG to realize importance sampling (a minimal replay-buffer sketch is given after this list). In addition, a PID controller and a simple obstacle-avoidance controller are introduced during exploration, which solves the problem that the original DDPG algorithm converges to a local optimum under high-dimensional inputs. The effectiveness of the algorithm is verified in simulation and the learned policy is successfully transferred to a physical experimental platform.

2. For multi-robot navigation, the single-robot DDPG algorithm is extended to the multi-robot system, yielding the Parallel Deep Deterministic Policy Gradient (PDDPG) algorithm based on independent reinforcement learning. Multiple robots are loaded into a single simulation environment and share the experience pool and the policy. Because the task requires a collaborative strategy, the algorithm converges in the training environment, but its performance degrades sharply in the test environment.

3. Using the idea of group collaboration, the Group Collaborative Deep Deterministic Policy Gradient (GCDDPG) algorithm is proposed. When training the evaluation (critic) network, the combined state and action information of the whole multi-robot system is used as input to achieve centralized training, while each policy network needs only the robot's local observations and target-point information to perform the collaborative navigation task (a sketch of this actor/critic split is given after this list). Simulation results show that the proposed method generalizes better than PDDPG.

4. The GCDDPG algorithm is used to train collaborative navigation, with formation constraints and velocity-direction constraints incorporated into the reward function (an illustrative reward sketch is given after this list). Given local target points, each robot uses only its local sensor data and target-point information to achieve formation movement and obstacle avoidance without communication.
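The following is a minimal sketch of the proportional prioritized experience replay mentioned in contribution 1, with importance-sampling weights correcting the non-uniform sampling. Class and parameter names (`PrioritizedReplayBuffer`, `alpha`, `beta`) are illustrative assumptions, not taken from the thesis.

```python
# Illustrative prioritized replay buffer; names and hyperparameters are assumptions.
import numpy as np

class PrioritizedReplayBuffer:
    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly TD-error priorities shape sampling
        self.beta = beta        # strength of the importance-sampling correction
        self.eps = eps
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float32)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # sampled at least once before their TD error is known.
        max_prio = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        prios = self.priorities[:len(self.data)] ** self.alpha
        probs = prios / prios.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias of non-uniform sampling,
        # normalized by the maximum weight for stability.
        weights = (len(self.data) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        batch = [self.data[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors):
        # Priority is proportional to the magnitude of the TD error.
        self.priorities[idx] = np.abs(td_errors) + self.eps
```

In training, the sampled `weights` would scale the critic's TD loss per transition, and `update_priorities` would be called with the resulting TD errors.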
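Below is a sketch of the centralized-critic / decentralized-actor structure described for GCDDPG in contribution 3: the critic is trained on the combined states and actions of all robots, while each actor uses only its own laser observation and local target. Network sizes, layer counts, and names are assumptions for illustration only.

```python
# Sketch of GCDDPG-style networks (centralized training, decentralized execution).
# Dimensions and architecture are illustrative assumptions.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Per-robot policy: local observation + local goal -> action."""
    def __init__(self, obs_dim, goal_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + goal_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, act_dim), nn.Tanh(),   # bounded velocity commands
        )

    def forward(self, local_obs, local_goal):
        return self.net(torch.cat([local_obs, local_goal], dim=-1))

class CentralizedCritic(nn.Module):
    """Critic over the joint state and joint action of all robots."""
    def __init__(self, n_robots, obs_dim, goal_dim, act_dim):
        super().__init__()
        joint_dim = n_robots * (obs_dim + goal_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 1),                    # Q-value of the joint state-action
        )

    def forward(self, joint_obs, joint_goals, joint_actions):
        return self.net(torch.cat([joint_obs, joint_goals, joint_actions], dim=-1))
```

At execution time only the actors are needed, which is why no inter-robot communication is required once training is complete.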
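Finally, an illustrative sketch of how formation and velocity-direction constraints could enter the reward, as described in contribution 4. The weights, thresholds, and the use of a desired offset to a reference point (e.g., a virtual leader) are assumptions; the thesis may define the formation term differently.

```python
# Illustrative shaped reward combining goal progress, collision penalty,
# a formation-keeping term, and a velocity-direction term. All weights,
# thresholds, and the reference-point formulation are assumptions.
import numpy as np

def shaped_reward(pos, prev_pos, goal, min_laser_range, desired_offset, ref_pos,
                  velocity, w_goal=2.5, w_form=1.0, w_dir=0.5, collision_dist=0.2):
    r = 0.0
    # Progress toward the robot's local target point.
    r += w_goal * (np.linalg.norm(prev_pos - goal) - np.linalg.norm(pos - goal))
    # Collision penalty from the raw laser readings.
    if min_laser_range < collision_dist:
        r -= 10.0
    # Formation constraint: penalize deviation from the desired offset
    # relative to a reference point (assumed here to be a virtual leader).
    r -= w_form * np.linalg.norm((pos - ref_pos) - desired_offset)
    # Velocity-direction constraint: reward moving toward the local target.
    to_goal = goal - pos
    if np.linalg.norm(velocity) > 1e-6 and np.linalg.norm(to_goal) > 1e-6:
        cos_sim = np.dot(velocity, to_goal) / (
            np.linalg.norm(velocity) * np.linalg.norm(to_goal))
        r += w_dir * cos_sim
    return r
```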
Keywords/Search Tags: Multi-robot, collaborative navigation, deep reinforcement learning