Research Of Path Planning Algorithm Based On Reinforcement Learning

Posted on: 2021-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
GTID: 2428330629952693
Subject: Computer application technology
Abstract/Summary:
Mobile robot technology has broad application prospects, and realizing it requires expertise from several disciplines. Among these, path planning is the key. Path planning for a mobile robot means avoiding obstacles and planning a path from the initial position to the target position in an unknown environment. In this process, the robot needs to explore the environment and find the destination independently. Reinforcement learning imitates the way humans learn and enables a mobile robot to learn autonomously. By trial and error, the robot repeatedly corrects its current motion according to feedback from the environment until it finds the best way to complete the task. Therefore, the main research method of this thesis is reinforcement learning, and the main research direction is path planning in an unknown environment.

Applying a reinforcement learning algorithm to the path planning problem gives a mobile robot the ability to learn and adapt by itself. However, some problems remain in practical application. The first is the exploration-exploitation dilemma. When the robot makes an action decision, it faces two choices. One is to explore the environment in order to collect more information about it. The other is to exploit known knowledge in order to choose the action that best helps it reach the target location. The main difficulty in resolving this dilemma is how to allocate the probabilities of exploration and exploitation reasonably. The second problem is how to design a reward function that effectively feeds back environmental information and gives the robot correct guidance. Both problems affect the convergence of the algorithm: if the algorithm does not converge, the robot cannot obtain the optimal path. To accelerate convergence, we propose an adaptive exploration method and optimize the reward function. Combining these two improvements, we also propose a Q-learning algorithm based on adaptive exploration.

The main research work is summarized as follows:

(1) An adaptive exploration method based on the ε-greedy algorithm is proposed to solve the exploration-exploitation dilemma in the action selection strategy. It divides the training process of the agent into three stages. According to the different requirements of the three stages, the method dynamically adjusts the exploration factor and allocates the probabilities of exploration and exploitation reasonably. It also improves exploration efficiency, reduces exploration time, and accelerates convergence.

(2) The reward function is optimized, because the original reward function used in reinforcement learning is too simple. The optimized reward function classifies the agent's state-action pairs and refines the reward rules. It feeds back more environmental information, gives the mobile robot more guidance, and improves both learning efficiency and convergence.

(3) Simulation experiments are carried out in three different scenarios to verify the feasibility of the algorithm. The results show that the proposed algorithm finds the optimal path successfully. Moreover, comparative experiments between Q-learning, SARSA, and the proposed algorithm show that the proposed algorithm has better path planning performance, the lowest computation time, and the fastest convergence.
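
The two improvements summarized above can be illustrated with a small sketch. The abstract does not give the stage boundaries, the exploration-factor values, or the refined reward rules, so everything specific here is an assumption made for illustration: the equal-thirds stage split and linear decay in `staged_epsilon`, the reward values in `refined_reward`, the 5x5 grid, and all function names are hypothetical and are not taken from the thesis.

```python
import random

# Hypothetical 5x5 grid world; none of these values come from the thesis.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # (row, col) moves: right, left, down, up
SIZE = 5
START, GOAL = (0, 0), (4, 4)
OBSTACLES = frozenset({(1, 1), (2, 3), (3, 1)})


def staged_epsilon(episode, total_episodes, eps_high=0.9, eps_low=0.05):
    """Assumed three-stage schedule: explore, transition, exploit."""
    progress = episode / total_episodes
    if progress < 1 / 3:            # stage 1: mostly explore
        return eps_high
    if progress < 2 / 3:            # stage 2: decay linearly toward eps_low
        frac = (progress - 1 / 3) * 3
        return eps_high + frac * (eps_low - eps_high)
    return eps_low                  # stage 3: mostly exploit


def step(state, action):
    """Return (next_state, blocked); a blocked move leaves the agent in place."""
    nxt = (state[0] + action[0], state[1] + action[1])
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    if not inside or nxt in OBSTACLES:
        return state, True
    return nxt, False


def distance(state):
    """Manhattan distance from a cell to the goal."""
    return abs(GOAL[0] - state[0]) + abs(GOAL[1] - state[1])


def refined_reward(state, nxt, blocked):
    """Assumed reward rules that classify each transition by its outcome."""
    if blocked:
        return -5.0                 # hit an obstacle or the boundary
    if nxt == GOAL:
        return 10.0                 # reached the target
    return 0.1 if distance(nxt) < distance(state) else -0.2


def greedy_action(q, state):
    """Index of the action with the highest Q-value (unseen pairs count as 0)."""
    return max(range(len(ACTIONS)), key=lambda a: q.get((state, a), 0.0))


def train(episodes=900, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Tabular Q-learning with the staged epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = {}
    for episode in range(episodes):
        eps = staged_epsilon(episode, episodes)
        state = START
        for _ in range(max_steps):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))   # explore
            else:
                a = greedy_action(q, state)       # exploit
            nxt, blocked = step(state, ACTIONS[a])
            r = refined_reward(state, nxt, blocked)
            # The goal is terminal, so nothing is bootstrapped from it.
            best_next = 0.0 if nxt == GOAL else q.get((nxt, greedy_action(q, nxt)), 0.0)
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (r + gamma * best_next - old)
            state = nxt
            if state == GOAL:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from START, capped at max_steps moves."""
    path = [START]
    while path[-1] != GOAL and len(path) <= max_steps:
        nxt, _ = step(path[-1], ACTIONS[greedy_action(q, path[-1])])
        path.append(nxt)
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Running the script trains on the toy grid and prints the cells visited by the learned greedy policy. Swapping `staged_epsilon` for a constant value gives plain ε-greedy Q-learning, which is one simple way to compare the two exploration strategies on the same grid.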
Keywords/Search Tags:Path Planning, Reinforcement Learning, Q-learning, Obstacle Avoidance