
Research on Intelligent Path Planning and Collision Avoidance Algorithms for a Six-DOF Robot Arm

Posted on: 2022-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhong
GTID: 2518306539469074
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, the installed base of industrial manipulators in manufacturing has continued to grow, while the traditional manual teaching mode has begun to fall behind, because diversified production demand and the trend toward customization place higher requirements on the intelligence of industrial manipulators. Offline path planning and online collision avoidance planning are key technologies for raising that level of intelligence. This thesis takes the six-DOF industrial manipulator HSR-BR606 as its research object. For offline path planning and online collision avoidance planning, it studies planning algorithms in high-dimensional joint space trained with deep reinforcement learning, and proposes an offline path planning algorithm that combines the deep deterministic policy gradient (DDPG) algorithm with an inverse kinematics algorithm. In addition, the basic deep reinforcement learning algorithm is improved in several respects and applied to online collision avoidance planning of the manipulator. The main work and results are as follows:

(1) A forward kinematics model of the six-DOF industrial manipulator is established, and the pose of each link (including the end-effector) relative to the base is derived. The Jacobian matrix of the manipulator is then constructed, and a numerical solution to the inverse kinematics problem based on the pseudo-inverse of the Jacobian is derived. The algorithmic foundations and key techniques of deep reinforcement learning are also summarized.

(2) To address the long training time of deep reinforcement learning in manipulator path planning, improvements are proposed from the perspective of exploration and exploitation. Because traditional deep reinforcement learning training uses a random exploration strategy, much of the exploration is ineffective. To narrow the manipulator's exploration in the workspace to the effective region, an inverse kinematics module is introduced. The module is equivalent to giving the robot a prior velocity direction toward the target point, so that it explores policies in the direction of the target. In addition, a time-varying gain module is introduced to prevent the manipulator from falling into a local optimum through over-reliance on this prior velocity.

(3) For the online collision avoidance planning problem, which is more complex, the shortcomings of the DDPG algorithm are addressed as follows: first, to counter overestimation of the action-value function, double Q-networks are used for selective fitting; second, to reduce the accumulation of variance in the action policy during learning, the policy network is updated with a delay, and noise is added to the original action to smooth the policy network's action selection; third, to improve sample efficiency, a priority sampling mechanism is introduced, which speeds up training.

(4) Simulation experiments are carried out for each proposed planning algorithm on the robot simulation platform CoppeliaSim. In a few comparison experiments, the proposed offline path planning algorithm shows significantly improved convergence, its planning results are close to those of the existing optimal sampling-based planning algorithm, and its success rate reaches 100%, much higher than that of the sampling-based planning algorithm. The improved online collision avoidance planning algorithm also successfully trains a collision avoidance policy model and improves convergence performance.
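The inverse kinematics prior and the time-varying gain described in item (2) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: it uses a hypothetical planar three-joint arm instead of the HSR-BR606, adds a tiny damping term to the pseudo-inverse for numerical safety, and assumes a linear decay for the gain and an additive blend with the policy action. All names (`LINKS`, `prior_velocity`, `gain`, `blended_action`) are invented for the example.

```python
import math

LINKS = (1.0, 0.8, 0.5)  # hypothetical link lengths of a planar 3-joint arm


def forward(q):
    """End-effector (x, y) of the planar arm for joint angles q."""
    x = y = angle = 0.0
    for length, qi in zip(LINKS, q):
        angle += qi
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


def jacobian(q):
    """Rows of the 2x3 position Jacobian: entry i is d(x or y)/d(q_i)."""
    cum, angle = [], 0.0
    for qi in q:
        angle += qi
        cum.append(angle)
    n = len(q)
    # Joint i rotates every link from i outward.
    row_x = [-sum(LINKS[k] * math.sin(cum[k]) for k in range(i, n)) for i in range(n)]
    row_y = [sum(LINKS[k] * math.cos(cum[k]) for k in range(i, n)) for i in range(n)]
    return row_x, row_y


def prior_velocity(q, target, damping=1e-6):
    """Joint velocity J^+ e with J^+ = J^T (J J^T + damping * I)^-1."""
    row_x, row_y = jacobian(q)
    x, y = forward(q)
    ex, ey = target[0] - x, target[1] - y
    a = sum(v * v for v in row_x) + damping
    b = sum(u * v for u, v in zip(row_x, row_y))
    d = sum(v * v for v in row_y) + damping
    det = a * d - b * b
    # Closed-form solve of the symmetric 2x2 system [[a, b], [b, d]] w = e.
    wx = (d * ex - b * ey) / det
    wy = (a * ey - b * ex) / det
    return [jx * wx + jy * wy for jx, jy in zip(row_x, row_y)]


def gain(step, total_steps, k0=1.0):
    """Assumed schedule: linear decay from k0 to 0 over training."""
    return k0 * max(0.0, 1.0 - step / total_steps)


def blended_action(policy_action, q, target, step, total_steps):
    """Policy action plus the gain-weighted prior velocity (assumed blend)."""
    k = gain(step, total_steps)
    prior = prior_velocity(q, target)
    return [a + k * p for a, p in zip(policy_action, prior)]
```

Early in training the prior dominates and steers exploration toward the target; as the gain decays, the learned policy takes over, which is one plausible way to avoid leaning on the prior velocity too much.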
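The three improvements in item (3) resemble the components of twin-delayed DDPG (TD3) combined with prioritized experience replay. The sketch below assumes that reading and is not taken from the thesis: the networks are stood in for by plain callables, and every name and hyperparameter value (`noise_std`, `noise_clip`, `policy_delay`, `alpha`) is a hypothetical placeholder.

```python
import random


def smoothed_target_action(target_policy, next_state, noise_std=0.2,
                           noise_clip=0.5, max_action=1.0, rng=random):
    """Target-policy action plus clipped Gaussian noise, kept within bounds."""
    noisy = []
    for a in target_policy(next_state):
        eps = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        noisy.append(max(-max_action, min(max_action, a + eps)))
    return noisy


def clipped_double_q_target(reward, done, next_state, target_policy,
                            target_q1, target_q2, gamma=0.99, rng=random):
    """y = r + gamma * (1 - done) * min(Q1', Q2') at the smoothed action."""
    next_action = smoothed_target_action(target_policy, next_state, rng=rng)
    # Taking the smaller of the two estimates counters overestimation.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    return reward + gamma * (0.0 if done else 1.0) * q_min


def should_update_policy(step, policy_delay=2):
    """Delayed update: refresh the policy once every policy_delay critic steps."""
    return step % policy_delay == 0


def sample_prioritized(td_errors, batch_size, alpha=0.6, eps=1e-3, rng=random):
    """Sample indices with probability proportional to (|td_error| + eps)^alpha."""
    weights = [(abs(e) + eps) ** alpha for e in td_errors]
    return rng.choices(range(len(td_errors)), weights=weights, k=batch_size)
```

In a full training loop, the critics would be regressed toward `clipped_double_q_target` on batches drawn with `sample_prioritized`, and the policy and target networks would be updated only when `should_update_policy` returns true. A complete prioritized replay would normally also apply importance-sampling weights, which are omitted here for brevity.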
Keywords/Search Tags: Manipulator, Inverse kinematics, Deep reinforcement learning, Path planning, Collision avoidance algorithm