Mobile robots have a wide range of promising applications in fields such as logistics, warehousing, and search and rescue, and have attracted considerable attention from researchers. The uncertainty of unknown environments poses a great challenge to autonomous exploration, so efficiently realizing autonomous exploration of mobile robots in unknown environments has significant research and practical value. In recent years, reinforcement learning algorithms have provided new ideas and solutions to the problem of autonomous exploration in unknown environments. This paper therefore addresses this problem with a reinforcement learning approach to autonomous exploration in unknown environments and achieves effective results.

In this paper, the TD3 reinforcement learning algorithm for autonomous exploration is improved based on a global interest-point design. Specifically, the proposed autonomous exploration algorithm consists of two major parts: global path planning and local path planning. In the global planning part, interest points are set adaptively from LiDAR information, and the obtained interest points then serve as intermediate nodes for the subsequent local path planning. In the local path planning part, the TD3 algorithm is improved by using a residual-network structure to merge the input observation with the mobile robot's action before feeding them to the critic (Critic) network, so that the information loss of the observation and the robot's action is minimized, the risk of gradient explosion or vanishing is reduced, and the efficiency and stability of the network are improved. The minimum distance to the nearest obstacle is also added to the reward function, so that the network can evaluate states more accurately, select actions that keep the robot as far from obstacles as possible, reduce collisions between the mobile robot and obstacles, and improve the efficiency and success rate of autonomous exploration.

The proposed reinforcement-learning-based autonomous exploration technique is trained and tested in a simulation environment and then transferred to a real environment for testing. First, the improved TD3 algorithm combining interest points is compared with the DDPG and TD3 algorithms in simulation environments with and without pedestrians; the experimental results demonstrate that the improved method outperforms the unimproved methods in navigation success rate and collision rate, and can perform the autonomous exploration task efficiently in both settings. Then, autonomous exploration is tested in real scenarios with pedestrians in different movement modes and in unknown environments without pedestrians, verifying the practical effectiveness of the proposed method.
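To make the two modifications concrete, the following is a minimal sketch, not the authors' implementation: a critic forward pass that merges state features with the action through a residual (skip) connection, and a reward term penalizing small distance to the nearest obstacle. All layer sizes, weight initializations, and reward coefficients (`safe_dist`, `penalty`) are illustrative assumptions, and the network is reduced to a NumPy forward pass for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class ResidualCritic:
    """Critic that merges hidden state features with the action and adds a
    residual projection of the raw (state, action) input, so raw input
    information is not lost in the hidden layers and gradients have a
    shortcut path (sketch of the residual-merge idea, not the paper's net)."""
    def __init__(self, state_dim, action_dim, hidden=64):
        self.W1 = rng.normal(0.0, 0.1, (hidden, state_dim))
        # Merge layer: hidden state features concatenated with the action.
        self.W2 = rng.normal(0.0, 0.1, (hidden, hidden + action_dim))
        # Linear projection of the raw (state, action) pair for the skip path.
        self.Wskip = rng.normal(0.0, 0.1, (hidden, state_dim + action_dim))
        self.Wq = rng.normal(0.0, 0.1, (1, hidden))

    def q_value(self, state, action):
        h = relu(self.W1 @ state)
        merged = self.W2 @ np.concatenate([h, action])
        # Residual connection: add the projected raw input back in.
        skip = self.Wskip @ np.concatenate([state, action])
        return (self.Wq @ relu(merged + skip)).item()

def shaped_reward(base_reward, min_obstacle_dist, safe_dist=0.5, penalty=1.0):
    """Add an obstacle-distance term: states closer than `safe_dist` to the
    nearest obstacle are penalized proportionally (hypothetical coefficients)."""
    if min_obstacle_dist < safe_dist:
        return base_reward - penalty * (safe_dist - min_obstacle_dist)
    return base_reward
```

In a full TD3 setup, two such critics would be trained on the shaped reward; the sketch only shows how the residual merge keeps a direct path from the raw state-action input to the Q-value output.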