
Path Planning Of Patrol Robot Based On HPSO And Reinforcement Learning

Posted on: 2020-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Song
Full Text: PDF
GTID: 2428330596995447
Subject: Computer technology
Abstract/Summary:
With the development of science and technology, robots are used more and more widely. Path planning is an important topic in robotics research; in particular, handling obstacles in a complex dynamic environment is a central issue of dynamic path planning. The particle swarm optimization (PSO) algorithm offers high precision and fast convergence in solving path planning problems, while the Q-learning algorithm has strong autonomous learning ability in dynamic environments and can quickly obtain the optimal strategy by directly estimating the value function Q(s,a) of state-action pairs. In this paper, the two algorithms are integrated to solve the path planning problem in a complex dynamic environment.

This paper presents IHPSO-Q, a dynamic path planning algorithm that combines a globally static and a locally dynamic environment. Firstly, the concept of health degree is introduced into PSO. Lazy particles are identified by the number of oscillations and stagnations a particle exhibits during the iteration process, which keeps the algorithm from falling into local optima. The position and velocity of lazy particles are then updated with a guidance factor that pulls them toward the current optimal solution, improving the overall health of the swarm, accelerating the algorithm's iteration, and planning the global path.

Secondly, the action set A of the robot's movement directions is defined, and the state set S measuring the risk between the robot and obstacles is obtained by the BOX discretization method, so that continuous environmental information is discretized into effective input for the algorithm. A Q-value matrix is designed to record the expected reward between the state the patrol robot currently faces and its next decision action. By adding a backtracking method, the effect of the patrol robot's subsequent action learning is quickly propagated back to the current state, so the lag of Q-value propagation is
reduced and the update rules become more reasonable and effective.

Thirdly, while patrolling the globally planned path, the robot decomposes its motion into two priority-based behaviors: guidance and obstacle avoidance. When there are no obstacles in the surrounding environment, the robot follows the globally planned path under the guidance behavior. When an obstacle is detected, the obstacle avoidance behavior takes precedence: the Q-value matrix is computed in real time, the reward is determined according to predefined rules, and a new action strategy is selected and executed. Through reinforcement learning of dynamic obstacle avoidance, the robot builds an optimal strategy model for dynamic path planning that supports its intelligent patrol.

Finally, simulation experiments are carried out on the MATLAB R2016a platform. Test results on eight CEC standard benchmark functions show that the proposed IHPSO-Q algorithm improves optimization time, convergence speed, anti-interference ability, and the best performance index compared with the traditional HPSO algorithm. IHPSO-Q also needs fewer iterations than the traditional Q-learning algorithm and converges better, achieving the research goal.
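The lazy-particle mechanism described above can be sketched in a few lines. This is an illustrative Python sketch, not the thesis's implementation: the names `is_lazy` and `reactivate`, the thresholds `oscillation_limit` and `stagnation_limit`, and the value of `guidance_factor` are all assumptions made for demonstration.

```python
# Illustrative sketch of the "health degree" / lazy-particle idea:
# a particle that keeps oscillating or whose fitness stops improving
# is declared lazy and is pulled toward the swarm's best-known
# solution by a guidance factor. All names and thresholds here are
# assumptions, not the thesis's actual parameters.

def is_lazy(particle, oscillation_limit=5, stagnation_limit=5):
    """Low 'health degree': too many consecutive oscillations, or too
    many iterations without any fitness improvement."""
    return (particle["oscillations"] >= oscillation_limit
            or particle["stagnations"] >= stagnation_limit)

def reactivate(particle, global_best, guidance_factor=0.5):
    """Reset a lazy particle: steer its velocity and position toward
    the current global best and clear its counters, restoring the
    swarm's overall health."""
    for i, best_i in enumerate(global_best):
        step = guidance_factor * (best_i - particle["position"][i])
        particle["velocity"][i] = step
        particle["position"][i] += step
    particle["oscillations"] = 0
    particle["stagnations"] = 0
```

In a full IHPSO loop, `is_lazy` would be checked once per iteration after the standard PSO velocity and position updates, with `reactivate` applied only to the particles it flags.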
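The BOX discretization and the backtracking Q-update can likewise be sketched as below. The box size, sector width, learning rate, discount factor, and the eight-direction action set are illustrative assumptions, and the backward pass shown is just one simple way to realize the "quick propagation to earlier states" that the abstract describes.

```python
from collections import defaultdict

N_ACTIONS = 8  # assumed: 8 discrete movement directions for the robot

# Q-value table: maps a discretized state to one value per action.
Q = defaultdict(lambda: [0.0] * N_ACTIONS)

def discretize(distance, bearing, box_size=1.0, sector=45.0):
    """BOX-style discretization (illustrative): map a continuous
    obstacle distance and bearing into a small discrete state of
    (risk box, direction sector)."""
    return (int(distance // box_size), int(bearing // sector))

def backtrack_update(trajectory, reward, alpha=0.1, gamma=0.9):
    """Propagate the newest reward backwards along the visited
    (state, action) pairs, so earlier states feel the latest outcome
    immediately and the lag of Q-value propagation is reduced."""
    target = reward
    for state, action in reversed(trajectory):
        Q[state][action] += alpha * (target - Q[state][action])
        target = gamma * max(Q[state])  # bootstrap for the preceding step
```

A single call updates every step of the episode at once, instead of waiting many episodes for the reward to drift back one transition at a time as in plain one-step Q-learning.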
Keywords/Search Tags: Reinforcement learning, IHPSO-Q algorithm, Path planning, Dynamic obstacle avoidance