Research On Collaborative Control Method Of Dual-Arm Robots Based On Deep Reinforcement Learning

Posted on: 2022-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Liu
Full Text: PDF
GTID: 2518306311457974
Subject: Control Science and Engineering
Abstract/Summary:
Since the first robot appeared in the 1960s, robot technology has developed rapidly. Compared with a single-arm robot, a dual-arm cooperative robot offers higher flexibility and greater load capacity, and it plays an important role in industry, the service sector, medical treatment, and other fields. Traditional cooperative control strategies for dual-arm robots mostly rely on precise, task-specific mathematical models and therefore adapt poorly. As a result, the control performance of the manipulator can deteriorate, and the manipulator may even fail to complete the task. In recent years, deep reinforcement learning has developed rapidly. It can realize end-to-end control from high-dimensional raw input to output without a mathematical model, and it has achieved great success in game playing, robot control, and other areas. The purpose of this thesis is to use deep reinforcement learning to design a coordinated control strategy for a dual-arm robot so that the two arms can complete cooperative tasks.

Firstly, this thesis describes the research background and significance of dual-arm robots, and reviews the development of dual-arm robots in China and abroad as well as the state of research on their cooperative control strategies. Secondly, the mathematical description of the reinforcement learning problem is explained, and existing reinforcement learning algorithms are briefly introduced.

Thirdly, a control strategy is designed for the coordinated control of the dual-arm robot. Each arm is assigned an agent, and the coordinated control of the dual-arm robot is treated as a sequential decision-making process of two agents in a continuous action space. The "reward cooperation and punish competition" idea of the multi-agent deep deterministic policy gradient algorithm is used to train the agents to complete cooperative tasks, and the hindsight experience replay algorithm is used to address the sparse-reward problem of the manipulator. Combining the two algorithms yields a dual-agent deep deterministic policy gradient (DADDPG) algorithm. A simulation platform for the dual-arm robot is built on the MuJoCo (Multi-Joint dynamics with Contact) physics engine, using the physical robot as a prototype. Training the DADDPG algorithm in the simulation environment shows that it can control the dual-arm robot to avoid collisions and complete simple dual-arm tasks.

The cooperative grasping task of a dual-arm robot is a multi-step sequential decision-making task: several steps must be executed consecutively over a period of time, and a problem in any one step causes the whole task to fail. Because DADDPG performs unsatisfactorily on the cooperative grasping task, this thesis proposes two improvements based on the DADDPG algorithm. The first uses imitation learning to provide expert data for reinforcement learning and adds a learning-from-demonstrations module to the DADDPG algorithm, which guides the exploration of reinforcement learning. The behavior cloning loss is used as an auxiliary loss of the reinforcement learning algorithm. When the agent's policy performs better than a demonstration, the influence of that demonstration is discarded in order to optimize the control performance. The second fuses object detection with the DADDPG algorithm: the target distribution area obtained by the object detection method is taken as prior experience and designated as the working area for the reinforcement learning algorithm, which reduces the initial exploration space and improves exploration efficiency. The two improved algorithms are tested in the simulation environment, and the results show that the dual-arm robot can complete the cooperative grasping task, which verifies the effectiveness of the algorithms.
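The abstract says that the behavior cloning loss is an auxiliary loss and that a demonstration's influence is dropped once the policy outperforms it, but it does not say how that comparison is made. The following is a minimal sketch of one possible reading, assuming the critic's Q-values are used to compare the demonstrated action with the policy's own action. The function name `q_filtered_bc_loss` and all of its parameters are hypothetical and are not taken from the thesis.

```python
import numpy as np


def q_filtered_bc_loss(policy_actions, demo_actions, q_demo, q_policy):
    """Behavior cloning loss over demonstrations the critic still prefers.

    Assumption (not from the thesis): "better than the demonstration" is
    judged per sample by comparing critic estimates.

    policy_actions, demo_actions: shape (batch, action_dim)
    q_demo:   critic estimate Q(s, a_demo), shape (batch,)
    q_policy: critic estimate Q(s, pi(s)),  shape (batch,)
    """
    policy_actions = np.asarray(policy_actions, dtype=float)
    demo_actions = np.asarray(demo_actions, dtype=float)
    # Keep a demonstration only while the critic rates it above the
    # policy's own action; otherwise it contributes nothing to the loss.
    mask = np.asarray(q_demo, dtype=float) > np.asarray(q_policy, dtype=float)
    if not mask.any():
        return 0.0
    # Squared distance between the policy action and the demonstrated action.
    sq_err = np.sum((policy_actions - demo_actions) ** 2, axis=1)
    return float(np.mean(sq_err[mask]))


# Two samples: the critic prefers the first demonstration, while the policy
# already beats the second, so only the first sample enters the loss.
loss = q_filtered_bc_loss(
    policy_actions=[[0.0, 0.0], [1.0, 1.0]],
    demo_actions=[[1.0, 0.0], [0.0, 0.0]],
    q_demo=[2.0, 0.5],
    q_policy=[1.0, 1.5],
)
print(loss)  # 1.0
```

In a full training loop this term would be added, with some weight, to the actor loss of each agent; the weight and the way demonstrations are sampled are left open here because the abstract does not specify them.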
Keywords/Search Tags:Dual-Arm Robot, Deep Reinforcement Learning, Coordinated Manipulation, Demonstration, Object Identification