
Study On Robot Imitation Learning Based On Reinforcement Learning

Posted on: 2020-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: J H Lu
Full Text: PDF
GTID: 2428330623456209
Subject: Control Science and Engineering
Abstract/Summary:
As production demands grow, robots are being deployed in more application scenarios and at larger scale, and the demand for robot intelligence is growing with them. For motion planning tasks, traditional methods need an accurate model of the robot and of the environment it interacts with. Although these methods can complete motion planning tasks effectively, they are sensitive to the physical model, generalize poorly, and have poor real-time performance. To address these problems, this thesis combines imitation learning with robot motion planning and proposes two robot imitation learning methods based on deep deterministic policy gradient (DDPG): one for the case where the reward function can be specified explicitly and one for the case where it cannot. Based on these two methods and a human motion capture system, a robot imitation learning system is also constructed. The main contents of this thesis are as follows:

1. Study on robot imitation learning method with reward function
When the reward function can be specified explicitly, the main problems of robot imitation learning are exploration and reward shaping. To solve these problems, this thesis proposes a robot imitation learning method based on hindsight experience replay (HER). The method uses demonstration data together with the HER mechanism to address exploration and reward shaping, so that the robot can quickly complete the motion planning task under sparse reward. The experimental results show that the method makes effective use of the sparse reward and learns faster than the other methods, even when the success rate of the demonstrations is low. The method also effectively reduces the vibration of the robot and produces smooth motion.

2. Study on robot imitation learning method without reward function
When the reward function cannot be specified explicitly, traditional methods suffer from problems such as a large amount of computation and slow learning. To solve these problems, this thesis proposes a deterministic generative adversarial imitation learning (DGAIL) method. The method combines DDPG with a generative adversarial network (GAN) so that the robot can quickly imitate the demonstration policy. The experimental results show that the method can effectively complete the motion planning task without a reward function, and that its learning speed is less affected by the difficulty of the task. The method is also highly stable and can complete the motion planning task in different states.

3. Construction of robot imitation learning system
Based on the above two methods and the human motion capture system, a robot imitation learning system is constructed. The system uses a 6-DoF pose estimation method to detect the target object, and uses the above two methods to complete robot imitation learning tasks of different difficulty.

Robot imitation learning is an important research direction in robotics. The research in this thesis has both theoretical and practical value, and it supports the design and development of more intelligent, autonomous, and adaptive robots.
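The hindsight experience replay (HER) mechanism named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Transition` fields, the "final" relabelling strategy, the 0/-1 sparse reward, and the tolerance value are all assumptions chosen for illustration. The idea is that a failed episode is stored a second time with its goal replaced by the position the robot actually reached, so that some stored transitions carry a success signal even under sparse reward.

```python
import math
from typing import List, NamedTuple, Tuple

Vec = Tuple[float, ...]


class Transition(NamedTuple):
    """One goal-conditioned step (hypothetical layout, for illustration only)."""
    state: Vec
    action: Vec
    next_state: Vec
    achieved_goal: Vec  # goal-space position reached after the action
    goal: Vec           # goal the policy was conditioned on
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Return 0 on success and -1 otherwise (an assumed sparse-reward convention)."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def her_relabel_final(episode: List[Transition], tol: float = 0.05) -> List[Transition]:
    """Copy the episode, replacing every goal with the final achieved goal."""
    if not episode:
        return []
    new_goal = episode[-1].achieved_goal
    return [
        t._replace(goal=new_goal,
                   reward=sparse_reward(t.achieved_goal, new_goal, tol))
        for t in episode
    ]


# A failed two-step episode: the end effector never reaches the goal at (1.0, 1.0).
episode = [
    Transition((0.0, 0.0), (0.2, 0.0), (0.2, 0.0), (0.2, 0.0), (1.0, 1.0), -1.0),
    Transition((0.2, 0.0), (0.2, 0.1), (0.4, 0.1), (0.4, 0.1), (1.0, 1.0), -1.0),
]

# Store both the original and the relabelled transitions for off-policy training.
replay_buffer = episode + her_relabel_final(episode)
```

In this sketch the last relabelled transition is a success with respect to the substituted goal, which is the signal an off-policy learner such as DDPG could then train on. How the thesis mixes demonstration data into the buffer is not described in the abstract and is not modelled here.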
Keywords/Search Tags:robot learning, reinforcement learning, imitation learning, HER, DGAIL