
On Improved Reinforcement Learning Algorithm And Its Application In Control Of Robotic Manipulators

Posted on: 2021-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Yan
GTID: 2428330623467245
Subject: Control Science and Engineering
Abstract/Summary:
Controlling a robotic manipulator to accomplish a specific, complex task in an uncertain environment remains a very challenging problem. Traditional control methods often rely heavily on accurate system models. However, these models are often high-order, nonlinear, multivariable, and strongly coupled, and may even be difficult to obtain. It is therefore difficult to design a control law that gives a manipulator system good adaptability and a degree of autonomy in an ever-changing environment. As one of the powerful artificial intelligence technologies, reinforcement learning (RL) can learn a policy through autonomous interaction with an unknown environment. It has attracted extensive attention from scholars at home and abroad and has become a research hotspot in robotics and control. This thesis studies improved reinforcement learning algorithms and their application to the control of robotic manipulators.

Many problems and challenges remain in applying existing RL algorithms directly to manipulator control. First, most RL algorithms assume discrete state and action spaces, whereas robotic systems have high-dimensional continuous state and action spaces and therefore suffer from the curse of dimensionality. Second, existing methods have high sample complexity, but in practice the cost of interaction between the robot and the environment cannot be ignored. In addition, reward shaping is required for different control tasks, and designing an appropriate reward function is often not easy. Finally, many RL algorithms are difficult to converge and are sensitive to hyperparameters.

Within the RL framework, this thesis studies model-based and model-free methods respectively, proposes improvements to related RL algorithms for different tasks, and verifies the proposed solutions in simulation. The main work and results of this thesis are as follows:

1) First, the development of reinforcement learning in recent years and its application in robotics are reviewed. Second, the mathematical description of the RL problem, the Markov decision process (MDP), is introduced, and the two basic methods for solving an MDP are given. Finally, the dynamics of the manipulator system are analyzed and a dynamic model is given.

2) To solve the optimal control problem in an environment with unknown dynamics, this thesis proposes a reinforcement learning method based on local models, using differential dynamic programming (DDP). The method stabilizes the training process by imposing a constraint between the new and old trajectories, and accelerates the convergence of the algorithm. Because the algorithm is model-based, a better control policy can be obtained after only a few training iterations, which gives low sample complexity and high learning efficiency. The convergence speed and stability of the algorithm are also much better than those of previous methods.

3) To solve complex tasks in a high-dimensional state space with an unknown environment, a maximum-entropy off-policy deep reinforcement learning algorithm is proposed. First, because traditional model-free deep RL algorithms have high sample complexity, a new experience replay method is introduced. In particular, this method also allows the algorithm to handle reward functions that are sparse and binary. In addition, to address the poor stability of the algorithm, the maximum entropy framework is adopted. In this setting, the objective of the policy update is not only to maximize the expected reward but also to maximize the policy's entropy, which better balances exploration and exploitation.

The thesis validates the effectiveness of the proposed methods by computer simulation. Finally, conclusions and directions for further work are presented.
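For readers unfamiliar with the maximum entropy framework mentioned in item 3, the objective is commonly written in a form like the one below. This is only a sketch in standard textbook notation, not the thesis's own formulation: the discount factor \gamma, the temperature \alpha, the horizon T, and the trajectory notation \tau are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the thesis):
%   \pi     policy            r       reward function
%   \gamma  discount factor   \alpha  entropy temperature, \alpha > 0
%   T       horizon           \tau    trajectory (s_0, a_0, s_1, a_1, ...)
J(\pi) = \mathbb{E}_{\tau \sim \pi}\left[ \sum_{t=0}^{T} \gamma^{t}
         \Big( r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s)\big)
  = -\,\mathbb{E}_{a \sim \pi(\cdot \mid s)}\big[ \log \pi(a \mid s) \big].
```

In this form, setting \alpha = 0 recovers the usual expected-return objective, while a larger \alpha gives more weight to the entropy term and so favors more exploratory policies.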
Keywords/Search Tags: robotic control, deep learning, reinforcement learning, optimal control, Markov decision problems