As a common piece of industrial automation equipment, the manipulator has long been a research hotspot, particularly with respect to its obstacle avoidance planning algorithms. In recent years, as industrial environments have grown increasingly complex, dynamic obstacles often appear in the environment, which places higher demands on the obstacle avoidance planning of the manipulator. This paper proposes a dynamic obstacle avoidance planning strategy for manipulators in dynamic environments, based on the deep reinforcement learning algorithm Soft Actor-Critic (SAC), to better solve the dynamic obstacle avoidance planning problem of space manipulators in environments with moving obstacles. The proposed strategy not only overcomes the low efficiency of traditional obstacle avoidance planning methods in continuous action spaces and their high dependence on an accurate model of the environment, but also offers fast decision-making and high stability compared with general reinforcement learning algorithms.

The main research of this paper is as follows. First, the space manipulator is modeled and its kinematics are analyzed: the D-H parameter table of the space manipulator is given, and the forward and inverse kinematics are derived.

Second, since collision detection for irregular objects is relatively complex and inefficient, an efficient collision detection algorithm suitable for space manipulators is proposed. The algorithm is based on the oriented bounding box (OBB) structural model; it reduces computational complexity and improves computational efficiency and real-time performance.

Third, because traditional algorithms require an accurate model of the environment and their control of continuous action spaces is inefficient, a dynamic obstacle avoidance planning method for manipulators based on the deep reinforcement learning method Soft Actor-Critic is proposed. The SAC algorithm is improved by incorporating the prioritized experience replay (PER) mechanism, making
the algorithm more efficient in learning and faster to converge. The improved SAC algorithm was then tested on the CartPole task through the reinforcement learning development platform OpenAI Gym to verify its effectiveness.

Then, the framework of the dynamic obstacle avoidance planning method for the manipulator is built, and the whole process of the model is designed within this framework, including the design of the reward function, the state space, and the action space.

Finally, three sets of experiments are designed and carried out. The manipulator control experiments, with joint angles as the output, verify that the improved SAC algorithm can be used for motion control of the seven-axis manipulator Baxter; the results show that the improved SAC algorithm is feasible for manipulator control. The dynamic obstacle avoidance planning experiment in a dynamic environment verifies the feasibility of the proposed method. Algorithm comparison experiments verify that the improved SAC algorithm is superior to other reinforcement learning algorithms in terms of planning success rate and the length and smoothness of the planned path.
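The prioritized experience replay mechanism used to improve SAC can be sketched as follows. This is a minimal proportional-PER buffer for illustration only; the class name, hyperparameters, and data layout are assumptions, not details taken from the thesis:

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (illustrative sketch).

    Transitions are sampled with probability p_i^alpha / sum_j p_j^alpha,
    and importance-sampling weights correct the bias this introduces.
    """

    def __init__(self, capacity, alpha=0.6, beta=0.4):
        self.capacity = capacity
        self.alpha = alpha        # how strongly priority skews sampling
        self.beta = beta          # importance-sampling correction strength
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # sampled at least once before their TD error is known.
        max_prio = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        prios = self.priorities[:len(self.data)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights, normalised by the maximum weight
        # so that updates are only ever scaled down.
        weights = (len(self.data) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        batch = [self.data[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Priority is the TD-error magnitude plus a small epsilon,
        # so no transition's sampling probability collapses to zero.
        self.priorities[idx] = np.abs(td_errors) + eps
```

In an SAC training loop, the critic's TD errors for the sampled batch would be fed back through `update_priorities`, and the returned `weights` would scale the per-sample critic loss.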
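A reward function of the kind described in the framework design (goal-distance shaping combined with obstacle and collision penalties) might take the following shape. The specific weights, thresholds, and terms below are illustrative assumptions, not the thesis's actual reward:

```python
import numpy as np

def shaped_reward(ee_pos, goal_pos, min_obstacle_dist,
                  goal_tol=0.05, safe_dist=0.10,
                  w_dist=1.0, w_obs=0.5,
                  goal_bonus=10.0, collision_penalty=-10.0):
    """Illustrative reward for obstacle-avoiding reaching.

    Returns (reward, done): a negative distance-to-goal term drives the
    end effector toward the target, a proximity penalty grows as the arm
    nears an obstacle, and the episode terminates with a bonus on reaching
    the goal or a large penalty on collision.
    """
    d_goal = np.linalg.norm(np.asarray(ee_pos) - np.asarray(goal_pos))
    if min_obstacle_dist <= 0.0:          # collision detected
        return collision_penalty, True
    if d_goal < goal_tol:                  # goal reached
        return goal_bonus, True
    r = -w_dist * d_goal                   # dense shaping toward the goal
    if min_obstacle_dist < safe_dist:      # graded proximity penalty
        r -= w_obs * (safe_dist - min_obstacle_dist) / safe_dist
    return r, False
```

The dense distance term keeps the gradient signal informative far from the goal, while the graded proximity penalty discourages paths that skim obstacles without making every near approach terminal.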