Owing to their convenience and efficiency, mobile robots now appear in many aspects of daily life. With the development of computing, path planning has attracted growing research attention and has become one of the core technologies in robot navigation and autonomous driving. When traditional path planning algorithms deal with complex environments, they suffer from low search efficiency and a tendency to fall into local optima. To address these problems, this paper studies existing path planning algorithms, focusing in turn on the RRT* algorithm and the Q-learning algorithm. In view of their shortcomings, namely the aimless sampling, low search efficiency, and unsmooth paths of RRT*, and the slow convergence, susceptibility to local optima, and inability to handle dynamic environments of Q-learning, improvements are made so that a robot can find a path more quickly and flexibly. The main work of this paper is as follows:

(1) To address the aimless sampling of the asymptotically optimal Rapidly-exploring Random Tree algorithm (RRT*), the small number of effective nodes caused by a large sampling area, and its low search efficiency, this paper proposes an improved RRT* algorithm guided by an artificial potential field, incorporating the advantages of the potential field into the expansion strategy. First, sampling guided by the artificial potential field makes the search more goal-directed. Second, to reduce redundant nodes on the path, an adaptive, dynamically changing neighborhood radius controls the search range in the parent-reselection stage. Third, once a feasible path is found for the first time, a point on that path is selected to generate an ideal sampling region, and subsequent sampling is restricted to this region, reducing the number of samples needed to approach the optimal solution and thus improving search efficiency. Finally, a path post-processing method removes redundant nodes and yields a smooth path suitable for robot motion. Comparative experiments in a simulation environment show that, relative to three other algorithms, the improved algorithm searches for feasible paths in a shorter time and produces shorter paths. A sketch of the guided sampling and the adaptive radius is given after this paragraph.
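The abstract gives no implementation details, so the following Python fragment is only a minimal sketch of how potential-field-guided sampling and an adaptive neighborhood radius might be realized; the attractive gain k_att, the radius schedule, and all function names are illustrative assumptions, not the thesis's actual design.

import math
import random

def apf_guided_sample(goal, x_range, y_range, k_att=0.5):
    # Assumption: draw a uniform random sample, then pull it toward the
    # goal along the attractive force of the artificial potential field,
    # F = k_att * (goal - q), so sampling is goal-directed.
    x = random.uniform(*x_range)
    y = random.uniform(*y_range)
    x = min(max(x + k_att * (goal[0] - x), x_range[0]), x_range[1])
    y = min(max(y + k_att * (goal[1] - y), y_range[0]), y_range[1])
    return (x, y)

def adaptive_radius(n_nodes, gamma=30.0, r_min=1.0, dim=2):
    # Assumed schedule in the spirit of the standard RRT* neighborhood:
    # the parent-reselection radius shrinks as the tree grows, which
    # limits the search range and trims redundant rewiring.
    n = n_nodes + 1
    return max(r_min, gamma * (math.log(n) / n) ** (1.0 / dim))

For example, apf_guided_sample((9.0, 9.0), (0.0, 10.0), (0.0, 10.0)) biases samples toward a goal at (9, 9), while adaptive_radius(len(tree)) would stand in for the fixed neighborhood radius of plain RRT*.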
(2) To address the slow convergence, low search efficiency, susceptibility to local optima, and inability to handle dynamic environments of the Q-learning algorithm in path planning, a path planning algorithm based on a double-layer Q-learning model is proposed. First, for the inherent trade-off between exploration and exploitation in reinforcement learning, the greedy exploration coefficient is made to change dynamically as the number of learning episodes increases; this lets the algorithm adapt quickly to changes in the environment, which both accelerates the search and reduces the risk of local optima. Second, in the Q-table update strategy, the number of forward learning steps is increased so that the Q-table absorbs information about the unknown environment faster, accelerating convergence. Finally, a double-layer Q-learning model is established: the lower layer learns global environment information, while the upper layer learns local environment information, handles dynamic obstacles, and realizes path planning in dynamic environments. Simulation results show that, compared with the standard Q-learning algorithm and two other improved algorithms, the improved algorithm learns and adapts to the environment faster; it not only finds the optimal route but also converges more quickly and plans routes more effectively, demonstrating the soundness and efficiency of the proposed method. A sketch of the decaying exploration coefficient and the multi-step Q-table update follows.
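Again as an illustrative sketch only: the decay law, the horizon n, and all hyperparameters below are assumptions, and the coordination between the two Q-learning layers is omitted. The fragment shows an episode-dependent greedy coefficient and an n-step return that writes several rewards back into the Q-table at once.

import random
from collections import defaultdict

def epsilon(episode, eps_start=0.9, eps_min=0.05, decay=0.995):
    # Assumed schedule: exploration probability shrinks as the number of
    # learning episodes grows, shifting from exploration to exploitation.
    return max(eps_min, eps_start * decay ** episode)

def choose_action(Q, state, actions, episode):
    if random.random() < epsilon(episode):
        return random.choice(actions)                 # explore
    return max(actions, key=lambda a: Q[(state, a)])  # exploit

def n_step_update(Q, steps, s_next, actions, alpha=0.1, gamma=0.9):
    # steps: the last n transitions as (state, action, reward), oldest
    # first; s_next: the state reached after the final transition.
    # Accumulate the discounted n-step return, bootstrapping from the
    # newest state, and credit it to the oldest state-action pair, so
    # fresh environment information propagates back n steps at once.
    G = max(Q[(s_next, a)] for a in actions)
    for _, _, r in reversed(steps):
        G = r + gamma * G
    s0, a0, _ = steps[0]
    Q[(s0, a0)] += alpha * (G - Q[(s0, a0)])

A typical grid-world use would keep Q = defaultdict(float) and actions = [(0, 1), (0, -1), (1, 0), (-1, 0)], calling n_step_update on a sliding window of the last three transitions after every move.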