With the continuous development of artificial intelligence technology, its applications in industry have become increasingly widespread. Among them, the integration of industrial robots and artificial intelligence has gradually attracted strong attention from both academia and industry. In practical applications of industrial robots, path planning is a key issue. Traditional robot path planning methods usually plan on the basis of predefined motion trajectories or joint angles; they lack generalization capability and therefore struggle to perform flexible manufacturing tasks. Deep reinforcement learning, with its mechanism of stochastic exploration of the environment and exploitation of accumulated experience, offers a feasible route for planning policies that generalize robot motion paths. However, existing deep reinforcement learning algorithms still face problems when applied to robot path planning tasks, such as difficult convergence, low exploration efficiency, and poor policy interpretability. This paper therefore addresses these problems and investigates the use of a hierarchical structure to reduce the dimensionality of the state-action space, so as to improve the performance of reinforcement learning in path planning tasks for robots in complex scenes. The main work is as follows:

(1) To overcome the low environment-exploration efficiency caused by the high action dimensionality of deep reinforcement learning in path planning for redundant-degree-of-freedom robots, a deep reinforcement learning framework based on hierarchical action execution, HDRL-A, is proposed. By introducing a hierarchical reinforcement learning approach, a complete linkage action of the robot is divided at the physical level into multiple coordinated sub-actions, where each sub-action is trained by an independent reinforcement learning agent. In this way, the dimensionality of each agent's action space during training is reduced overall, which increases its efficiency in exploring the environment as well as its ability to handle complex environments.

(2) To reduce the negative impact of high-dimensional environmental state information on action-policy training in path planning for robots in complex scenes, a deep reinforcement learning framework based on task-objective layering, HDRL-T, is proposed by combining the Option hierarchy method. The main task is decomposed into subtasks with as little mutual correlation as possible, and a decision-making layer and multiple task-execution layers are set up to process different state information separately. This strengthens, in a targeted manner, the correlation between state and action for each individual agent within its subtask, speeds up training, and enhances the interpretability of the trained policies.

In addition, to verify the above work, a 1:1 virtual twin platform replicating the physical experiment platform is built. A series of comparison experiments is conducted in both the simulated scene and the physical scene, and the results verify the feasibility, effectiveness, and superiority of the method proposed in this paper.
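The hierarchical action execution idea behind HDRL-A can be illustrated with a minimal sketch: a full joint-space action is split into sub-actions, each produced by an independent agent, and then concatenated into one linkage action. All names here (`SubActionAgent`, the 4+3 joint split, the random stand-in policy) are hypothetical illustrations, not the thesis's actual networks or training procedure.

```python
import numpy as np

class SubActionAgent:
    """Stand-in for one per-sub-action RL agent (random policy here)."""
    def __init__(self, n_joints, rng):
        self.n_joints = n_joints  # reduced action dimensionality this agent explores
        self.rng = rng

    def act(self, state):
        # A trained agent would map the state to joint increments; here we
        # sample small random increments to illustrate the smaller per-agent
        # action space compared with one agent controlling all joints.
        return self.rng.uniform(-0.1, 0.1, size=self.n_joints)

def hierarchical_action(agents, state):
    """Compose the robot's complete linkage action from each agent's sub-action."""
    return np.concatenate([agent.act(state) for agent in agents])

rng = np.random.default_rng(0)
# Example: a 7-DoF arm split into a 4-joint and a 3-joint sub-action
# (the particular split is an assumption for illustration).
agents = [SubActionAgent(4, rng), SubActionAgent(3, rng)]
state = np.zeros(10)  # placeholder environment observation
action = hierarchical_action(agents, state)
print(action.shape)   # the composed action covers all 7 joints
```

Each agent explores a 3- or 4-dimensional space instead of the full 7-dimensional one, which is the source of the exploration-efficiency gain the framework targets.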