| As a crucial branch of robotics, mobile robots are widely used in daily life, and their path planning algorithms have become an active research topic worldwide. However, most current path planning algorithms depend heavily on environmental map information and struggle to plan a collision-free path from start to goal in unknown environments. Drawing on deep reinforcement learning, this paper therefore proposes an end-to-end path planning algorithm with autonomous learning ability, which overcomes the heavy dependence on map information of traditional path planning algorithms. The research contents of this paper are as follows. First, the current state of research on path planning and deep reinforcement learning is analyzed and summarized. Then, the deep reinforcement learning algorithm DQN is applied to a path planning task on a two-dimensional grid map to verify the feasibility of combining deep reinforcement learning with path planning. Next, the Soft Actor-Critic (SAC) algorithm used in this paper is described in detail. To address the shortcomings of SAC in path planning tasks, an improved SAC algorithm (SAC-LSTM-PER) is proposed. There are three main improvements: first, a long short-term memory (LSTM) network is introduced, enabling the algorithm to make better decisions by combining previous and current states during path planning; second, a burn-in training mechanism is introduced to solve the memory degradation caused by resetting the LSTM hidden states during training, improving the algorithm’s performance; third, a prioritized experience replay (PER) mechanism is incorporated to address low sampling efficiency, improving the algorithm’s convergence speed. Then, in order to apply the SAC-LSTM-PER algorithm to the path planning task of mobile robots, a mobile robot path planning 
algorithm framework based on SAC-LSTM-PER is built, and a motion model of the Turtlebot3 mobile robot is established. Based on the nature of the path planning task, the state space, action space, reward function, and overall procedure of the path planning algorithm are designed. Finally, the effectiveness of the proposed algorithm is validated through simulation experiments. Three experimental scenarios (obstacle-free, static obstacle, and dynamic obstacle) are constructed using the ROS platform and Gazebo software. After the relevant experimental parameters are set, simulation experiments are carried out with the SAC, SAC-LSTM, and SAC-LSTM-PER algorithms. The results show that in all three scenarios, SAC-LSTM-PER converges faster and achieves a higher path planning success rate than SAC and SAC-LSTM. In addition, to better test the performance of the three algorithms, a further set of experiments is conducted in a complex dynamic scene. The results show that SAC-LSTM-PER takes less time to plan, produces shorter paths, and reaches the target point more often. |