Research On Path Planning Method Of Mobile Robot Based On Deep Q Learning

Posted on: 2021-12-22    Degree: Master    Type: Thesis
Country: China    Candidate: C Yang    Full Text: PDF
GTID: 2518306560453514    Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the continual advance of technology and of human living needs, mobile robot path planning has become a hot topic in robotics research. In path planning applications, the task of a mobile robot is to imitate human autonomous perception in an unknown, unfamiliar environment and make motion decisions so as to reach its destination safely and smoothly. In unknown environments, deep reinforcement learning algorithms suffer from overestimation and from insufficient training of important samples. This thesis proposes an improved DDQN algorithm (Improved Double Deep Q-Network, IDDQN) to reduce the number of collisions with obstacles during training, improve path planning ability, and achieve successful obstacle avoidance and adaptability to different environments. The main research work is as follows:

(1) To address the overestimation problem of deep reinforcement learning in mobile robot path planning, the Q-value update method of the Sarsa algorithm is introduced into the DDQN target-value calculation to temper the aggressive max operation in the target value. The improved target value is then used to compute the loss function and update the network parameters, so that the value function estimated by the network is closer to the true value. At the same time, the idea of Averaged-DQN is introduced into the ε-greedy strategy: the improved action selection strategy averages the output value functions of previous generations of the parameter network to determine the robot's next direction of motion, selecting the optimal action in the current state and reducing the influence of overestimation on action selection.

(2) To address insufficient training of important samples during path planning training, a rank-based priority replay mechanism is proposed, which increases the probability that important samples are replayed. Important samples thus receive sufficient training, improving the update efficiency of the network parameters and the learning efficiency of the mobile robot.

(3) To ensure the sufficiency of the simulation experiments, the performance of the algorithm is verified in simple, complex, and random environments. In the simple and complex environments, the IDDQN algorithm plans paths of shorter length with fewer inflection points than the other algorithms. Compared with the DDQN and DQN algorithms, IDDQN reduces the number of collisions with obstacles during training, demonstrating its advantages and effectiveness. The success rates and path-length differences in the subsequent random-environment results show that the IDDQN algorithm is more robust.

(4) To move closer to the real environment, a TurtleBot robot is modeled in the ROS system and simulated in the Gazebo environment. The experimental results show the advantages of the IDDQN algorithm in cumulative reward and average reward. Compared with the other baseline algorithms, IDDQN obtains higher reward and makes better decisions in the mobile robot path planning task.
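One plausible reading of the Sarsa-modified DDQN target in point (1) is a blend of the standard Double-DQN greedy term with an on-policy Sarsa term, which softens the max operation. The blend weight `lam` and the exact combination below are illustrative assumptions, not details taken from the abstract:

```python
import numpy as np

def iddqn_target(reward, gamma, q_online_next, q_target_next, next_action,
                 lam=0.5, done=False):
    """Hypothetical blended TD target: (1 - lam) * DDQN term + lam * Sarsa term.

    q_online_next / q_target_next: Q-value vectors for the next state s'
    next_action: the action actually chosen in s' (the Sarsa component)
    """
    if done:
        return reward
    # Double-DQN term: the online network's greedy action, evaluated by the
    # target network.
    a_star = int(np.argmax(q_online_next))
    ddqn_term = q_target_next[a_star]
    # Sarsa term: value of the action the agent actually takes in s', which
    # avoids relying solely on the (overestimation-prone) max.
    sarsa_term = q_target_next[next_action]
    return reward + gamma * ((1.0 - lam) * ddqn_term + lam * sarsa_term)
```

The blended target is then used in place of the usual DDQN target when computing the loss and updating the network parameters.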
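The averaged ε-greedy selection in point (1) can be sketched as follows. The window size `k`, and the choice to average raw Q-value outputs of recent network generations rather than their weights, are assumptions for illustration:

```python
import numpy as np
from collections import deque

class AveragedEpsilonGreedy:
    """Illustrative Averaged-DQN-style action selection: the Q-value outputs
    of the last k parameter-network generations are averaged before the
    greedy choice, damping the effect of any single overestimated Q-value."""

    def __init__(self, n_actions, k=5, epsilon=0.1, rng=None):
        self.n_actions = n_actions
        self.history = deque(maxlen=k)  # Q vectors from recent generations
        self.epsilon = epsilon
        self.rng = rng or np.random.default_rng()

    def select(self, q_current):
        self.history.append(np.asarray(q_current, dtype=float))
        if self.rng.random() < self.epsilon:
            # Explore: uniform random action.
            return int(self.rng.integers(self.n_actions))
        # Exploit: act greedily on the averaged estimate.
        q_avg = np.mean(self.history, axis=0)
        return int(np.argmax(q_avg))
```

With `epsilon = 0`, the choice is fully determined by the averaged Q vector, which makes the damping effect easy to inspect in isolation.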
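The rank-based priority replay of point (2) can be sketched as sampling transitions with probability proportional to (1/rank)^α, where rank 1 is the transition with the largest TD error. The buffer layout and the value of `alpha` below are assumptions:

```python
import numpy as np

class RankPrioritizedBuffer:
    """Minimal sketch of rank-based prioritized replay: transitions are
    ranked by |TD error| and sampled with probability (1/rank)^alpha,
    so important (high-error) samples are replayed more often."""

    def __init__(self, alpha=0.7, rng=None):
        self.transitions = []  # stored transition tuples
        self.td_errors = []    # |TD error| per transition
        self.alpha = alpha
        self.rng = rng or np.random.default_rng()

    def add(self, transition, td_error):
        self.transitions.append(transition)
        self.td_errors.append(abs(td_error))

    def sample(self, batch_size):
        # Rank 1 = largest TD error; priority decays as 1/rank^alpha.
        order = np.argsort(self.td_errors)[::-1]
        ranks = np.empty(len(order), dtype=float)
        ranks[order] = np.arange(1, len(order) + 1)
        p = (1.0 / ranks) ** self.alpha
        p /= p.sum()
        idx = self.rng.choice(len(self.transitions), size=batch_size, p=p)
        return [self.transitions[i] for i in idx]
```

Ranking by TD error rather than using the raw error magnitudes makes the sampling distribution insensitive to outlier errors, which is one common motivation for the rank-based variant.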
Keywords/Search Tags:Deep Q network, mobile robot, path planning, unknown environment, overestimation