
Research on Path Planning Based on Reinforcement Learning

Posted on: 2020-10-23 | Degree: Master | Type: Thesis
Country: China | Candidate: X X Guo
GTID: 2370330602450664 | Subject: Engineering
Abstract/Summary:
With the rapid development of science and technology, exploration in aerospace and unmanned driving continues to advance. Intelligent agents such as drones and unmanned vehicles are being applied ever more widely, which places higher demands on their algorithms. When an agent plans a path in a dynamic environment and a high-dimensional space while respecting constraints, planning becomes difficult, so traditional path planning algorithms need further improvement. In addition, intelligent agents should in the future no longer rely on manual control for routine tasks such as obstacle avoidance, path planning and navigation. They should complete these missions independently through interaction with the environment. Reinforcement learning offers a feasible technical route to this goal, and it has already been applied to difficult human-machine games and to other fields that are hard for humans to control. This thesis therefore proposes the iteration rapidly exploring random trees algorithm (IRRT), an improvement on a traditional path planning algorithm, and also proposes improved path planning algorithms based on reinforcement learning combined with a memory function. The main contents of the thesis are as follows.

(1) The thesis first analyzes and implements the traditional path planning algorithms, including the basic rapidly exploring random trees algorithm (RRT). To reduce the randomness of the algorithm and to handle dynamic obstacles, the idea of iteration and a random probability factor are added to RRT, giving the IRRT algorithm. When the random tree expands outward in IRRT, it is extended toward the goal point with a certain probability. On this basis, a three-dimensional map based on the octree model is constructed for the experiments. The algorithm iterates, compares the random trees to find the one containing the optimal path, and stores that path. The experimental results show that IRRT can partially re-plan the path in an environment with dynamic obstacles, and that in a dynamic environment the IRRT path is shorter than the RRT path.

(2) To explore paths, the thesis uses the grid method to construct different maps, with squares of different colors representing obstacles, agents and goal points. Using the classic Q-learning and Sarsa algorithms from reinforcement learning, the experiments produce and visualize graphs of the success rate, cumulative reward, arrow path, local exploration path and global exploration. To speed up convergence by giving Q-learning and Sarsa a memory function, the thesis derives two new algorithms, Qlearning Memory Trace (QMT) and Sarsa Memory Trace (SMT). The experimental results show better performance and faster convergence. The control variable method is then used to vary individual parameters and compare their effect on SMT. Suitable parameter choices are found to let SMT complete the path planning task faster and with better results.

(3) The thesis compares reinforcement learning experiments on a single agent with experiments on double agents, and introduces new strategies to prevent collisions between the two agents. While the double agents explore toward the target, one agent is treated as a dynamic obstacle, which ensures that the second agent does not collide with the first. The experimental results show that applying the SMT algorithm together with these strategies is effective for double agents, and that SMT can complete the simulated path planning task for double agents.
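The goal-directed extension described in part (1) can be illustrated with a minimal Python sketch. It shows only the "extend toward the goal with a certain probability" idea; the iteration step, the dynamic obstacles, the octree map and all collision checks are omitted. Every name and default value here (`sample_target`, `steer`, `grow_tree`, `goal_prob`, `step`) is hypothetical and not taken from the thesis.

```python
import math
import random


def sample_target(goal, bounds, goal_prob, rng):
    """Return the goal with probability goal_prob, else a uniform random point."""
    if rng.random() < goal_prob:
        return goal
    return tuple(rng.uniform(lo, hi) for lo, hi in bounds)


def steer(from_pt, to_pt, step):
    """Move from from_pt toward to_pt by at most `step`."""
    dist = math.dist(from_pt, to_pt)
    if dist <= step:
        return to_pt
    t = step / dist
    return tuple(a + t * (b - a) for a, b in zip(from_pt, to_pt))


def grow_tree(start, goal, bounds, goal_prob=0.2, step=0.5,
              max_iters=5000, seed=0):
    """Grow a goal-biased random tree in free space (no obstacles assumed).

    Returns the path from start to goal as a list of points, or None.
    """
    rng = random.Random(seed)
    parent = {start: None}  # maps each tree node to its parent
    for _ in range(max_iters):
        target = sample_target(goal, bounds, goal_prob, rng)
        near = min(parent, key=lambda node: math.dist(node, target))
        new = steer(near, target, step)
        if new in parent:
            continue  # nothing new to add
        parent[new] = near  # a real planner would collision-check here
        if new == goal:
            path, node = [], goal
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
    return None


if __name__ == "__main__":
    bounds = [(0.0, 10.0)] * 3  # a 3-D workspace, chosen for illustration
    path = grow_tree((0.0, 0.0, 0.0), (5.0, 5.0, 5.0), bounds)
    print(len(path), "nodes on the path")
```

Raising `goal_prob` makes the tree greedier and less random. In a cluttered map, too high a value tends to push the tree into obstacles, so it is normally a tuning parameter.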
Keywords/Search Tags:Reinforcement learning, IRRT algorithm, Path planning, QMT algorithm, SMT algorithm
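The abstract does not define how the memory trace in QMT and SMT works. One standard way to give Sarsa a memory of recently visited state-action pairs is the eligibility trace, so the sketch below shows tabular Sarsa(λ) with replacing traces on a small grid map. This is an assumed stand-in for illustration and may differ from the thesis's SMT algorithm. The grid layout, rewards, parameter values and all function names are hypothetical.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step_env(state, action, size, obstacles, goal):
    """Apply an action on the grid; return (next_state, reward, done)."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    if not inside or nxt in obstacles:
        return state, -1.0, False  # blocked: stay in place, assumed penalty
    if nxt == goal:
        return nxt, 10.0, True  # assumed goal reward
    return nxt, -0.1, False  # assumed small step cost


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def sarsa_lambda(size, obstacles, start, goal, episodes=300, alpha=0.1,
                 gamma=0.9, lam=0.8, epsilon=0.1, max_steps=200, seed=0):
    """Tabular Sarsa(lambda) with replacing eligibility traces."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        trace = {}  # the "memory" of recently visited (state, action) pairs
        state = start
        action = choose(q, state, epsilon, rng)
        for _ in range(max_steps):
            nxt, reward, done = step_env(state, action, size, obstacles, goal)
            nxt_action = choose(q, nxt, epsilon, rng)
            future = 0.0 if done else gamma * q.get((nxt, nxt_action), 0.0)
            delta = reward + future - q.get((state, action), 0.0)
            trace[(state, action)] = 1.0  # replacing trace
            for key in list(trace):
                # Every remembered pair shares in the current TD error.
                q[key] = q.get(key, 0.0) + alpha * delta * trace[key]
                trace[key] *= gamma * lam  # older memories fade
            if done:
                break
            state, action = nxt, nxt_action
    return q


def greedy_path(q, size, obstacles, start, goal, max_steps=50):
    """Follow the greedy policy from start; stop at the goal or after max_steps."""
    path, state = [start], start
    for _ in range(max_steps):
        values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
        state, _, done = step_env(state, values.index(max(values)),
                                  size, obstacles, goal)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    obstacles = {(1, 1), (2, 1)}
    q = sarsa_lambda(4, obstacles, (0, 0), (3, 3))
    print(greedy_path(q, 4, obstacles, (0, 0), (3, 3)))
```

Setting `lam=0.0` reduces the update to ordinary one-step Sarsa, which gives a simple baseline for a control-variable comparison of the kind the abstract mentions.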