
Self Learning Control Of Mechanical Arm Based On Reinforcement Learning

Posted on: 2020-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: H T Wang
Full Text: PDF
GTID: 2428330590974232
Subject: Control Engineering
Abstract/Summary:
Deep reinforcement learning (DRL) has become an important frontier of artificial intelligence. DRL-based methods have achieved breakthroughs in many fields, especially manipulator control. Traditional manipulator control methods depend heavily on models of the manipulator and of its environment. In unknown, dynamic, and unstructured scenarios, the manipulator can only operate in a preset way, which reduces the accuracy and computation speed of its actions. Overcoming this dependence is fundamental to expanding the manipulator's application scenarios. Following the development of DRL in recent years, this thesis adopts DRL methods to solve the motion control problem of the manipulator. The main research question is how a manipulator with high-dimensional perceptual input can learn an optimal control strategy autonomously in a specific environment using reinforcement learning. The goal is for the manipulator to learn autonomously to grasp an object in a specific area, using real-time images captured by two cameras as the state.

The main contents of this thesis are as follows:

Adjacent state images captured from a single perspective are highly correlated and cannot truly describe the current environment. Left and right dual perspectives are therefore used to reduce the error of mapping three-dimensional stereo images onto two-dimensional planar images.

For the control problem of a manipulator with continuous states and actions, a reward function for the manipulator's control strategy is designed that considers three factors: time, distance, and environmental robustness. The designed reward function remains applicable when the manipulator's environment changes greatly.

To ensure the continuity and safety of manipulator training in the experiment, and to prevent collisions between the manipulator and surrounding objects and the environment, a safety guarantee mechanism is introduced.

The storage mode of the experience replay mechanism in the DDPG (Deep Deterministic Policy Gradient) algorithm is changed to a data-constrained storage mode, which saves storage space at the beginning of training and improves learning efficiency.

In applying the PPO (Proximal Policy Optimization) algorithm, the method of estimating the advantage function is changed from the N-step return method to the GAE (Generalized Advantage Estimation) method. This gives the model a better balance between variance and bias and allows a more suitable estimator to be chosen for the specific application.

This thesis studies manipulator control based on a deterministic policy and on a stochastic policy, that is, control based on the DDPG algorithm and control based on the PPO algorithm. In the Gazebo simulation environment, the manipulator's task of grasping the target object in a specific area is realized with both the DDPG algorithm and the PPO algorithm. The experimental results verify the validity of the two control strategies and the feasibility and applicability of the PPO algorithm based on the GAE method.
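
The abstract's change of the PPO advantage estimator to GAE can be illustrated with a minimal sketch. This is the standard GAE recursion, not the thesis's own implementation; the function name `gae_advantages`, the default values of `gamma` and `lam`, and the convention that `values` carries one extra bootstrap entry are assumptions made for illustration.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,  # discount factor (assumed default)
    lam: float = 0.95,    # GAE lambda (assumed default)
) -> List[float]:
    """Generalized Advantage Estimation for one trajectory segment.

    `values` must hold len(rewards) + 1 entries: the value estimate of
    every visited state plus a bootstrap value for the state after the
    last step (use 0.0 there if the episode terminated).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the advantage of the step after it.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference residual.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of the residuals from step t onward.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

Setting `lam=0.0` reduces the estimate to the one-step TD residual (lower variance, more bias), while `lam=1.0` gives the full discounted return minus the value baseline (higher variance, less bias). Intermediate values of `lam` trade the two off, which is the variance and bias balance the abstract refers to.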
Keywords/Search Tags: mechanical arm, control strategy, deep reinforcement learning, deep deterministic policy gradient, proximal policy optimization