Research On Releasing Manipulation Based On Learning By Demonstration And Reinforcement Learning

Posted on: 2019-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: P Geng
Full Text: PDF
GTID: 2428330566459305
Subject: Control engineering
Abstract/Summary:
Robot technology has developed rapidly, but robot manipulation has long depended on traditional planning methods. These methods lack intelligence and autonomous decision-making ability, and they are not suited to complex manipulation tasks or to situations in which a model of the environment dynamics cannot be established accurately. Robot learning opens a new path for robot manipulation: avoiding direct environment modeling and acquiring the ability to complete tasks through autonomous learning has become a very popular research direction. Among the many kinds of robot manipulation, release manipulation is a typical case, because it includes the trajectory generation, trajectory reconstruction, and trajectory optimization stages found in most manipulations. This study further investigates the application of learning from demonstration and reinforcement learning to release manipulation. Specifically, a four-degree-of-freedom WAM manipulator is used to push and release a cylinder toward a target area. The hope is that this research will promote the application of learning from demonstration and reinforcement learning to releasing manipulation tasks.

First, learning from demonstration provides an initial solution. The motion of the robot can be roughly represented by a general dynamic model, and this model is then used to represent the trajectory. Adjusting the parameters of the model adjusts the movement so that it adapts to slight changes in the environment and the target. The dynamic movement primitive (DMP) method consists of a second-order spring-damper system, which adapts to the dynamics of the environment and the robot, plus a fitting term that approximates the trajectory of the robot. The complex dynamic model is thereby transformed into a simpler model with fewer parameters, which provides the initial solution for robot learning manipulation. Drawing on practical experience, this thesis improves the traditional second-order virtual spring-damper system and proposes a multi-spring-damper system. This method decouples the profile information of the trajectory from its target information and is time invariant, which helps improve the flexibility of the learning method.

After the initial solution is obtained, reinforcement learning is used to optimize the trajectory. The Policy Improvement with Path Integrals (PI2) method is applied to evaluate the trajectories executed by the dynamic movement primitive. A path with a higher cost corresponds to a lower probability. This intuition is used to update the model parameters of the dynamic movement primitive iteratively until the robot can complete the releasing manipulation task.
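
The two-stage pipeline described in the abstract (fit a dynamic movement primitive to a demonstration, then reweight noisy rollouts by their cost) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: it uses the standard single spring-damper DMP rather than the proposed multi-spring-damper system, one degree of freedom instead of the four-degree-of-freedom WAM arm, a synthetic minimum-jerk curve in place of a recorded demonstration, and a trajectory-level simplification of the path integral update. The gains, the basis-width heuristic, the range-normalized exponent, all function names, and `release_cost` (squared error of the speed at a hypothetical release step) are assumptions introduced for illustration.

```python
import numpy as np

# Assumed constants (not from the thesis): critically damped gains, 1 s movement.
ALPHA, BETA, ALPHA_X = 25.0, 6.25, 4.0
DT, TAU, N_BASIS = 0.01, 1.0, 20
STEPS = int(round(TAU / DT))
CENTERS = np.exp(-ALPHA_X * np.linspace(0.0, 1.0, N_BASIS))
WIDTHS = N_BASIS ** 1.5 / CENTERS / ALPHA_X  # a common width heuristic

def phase():
    """Phase x_t obtained from Euler steps of tau * dx = -alpha_x * x, x_0 = 1."""
    return (1.0 - ALPHA_X * DT / TAU) ** np.arange(STEPS)

def features(x, y0, g):
    """Normalized Gaussian bases, gated by the phase and scaled by (g - y0)."""
    psi = np.exp(-WIDTHS * (x[:, None] - CENTERS) ** 2)
    return psi / psi.sum(axis=1, keepdims=True) * (x * (g - y0))[:, None]

def fit(demo):
    """Least-squares fit of the fitting-term weights to one demonstration."""
    y0, g = demo[0], demo[-1]
    yd = np.gradient(demo, DT)
    ydd = np.gradient(yd, DT)
    # Forcing needed so the spring-damper system reproduces the demonstration.
    f_target = TAU ** 2 * ydd - ALPHA * (BETA * (g - demo) - TAU * yd)
    w, *_ = np.linalg.lstsq(features(phase(), y0, g), f_target, rcond=None)
    return w, y0, g

def rollout(w, y0, g):
    """Euler-integrate tau*dz = alpha*(beta*(g - y) - z) + f, tau*dy = z."""
    f = features(phase(), y0, g) @ w
    y, z, traj = y0, 0.0, np.empty(STEPS)
    for t in range(STEPS):
        traj[t] = y
        dz = (ALPHA * (BETA * (g - y) - z) + f[t]) / TAU
        y, z = y + z / TAU * DT, z + dz * DT
    return traj

def path_probabilities(costs, h=10.0):
    """Softmax of negated, range-normalized costs: higher cost, lower probability."""
    s = np.asarray(costs, dtype=float)
    expo = np.exp(-h * (s - s.min()) / (s.max() - s.min() + 1e-12))
    return expo / expo.sum()

def release_cost(traj, release_step=50, v_target=2.5):
    """Hypothetical task cost: squared error of the speed at the release step."""
    v = (traj[release_step + 1] - traj[release_step]) / DT
    return (v - v_target) ** 2

def improve(w, y0, g, cost_fn, iters=20, n_rollouts=20, sigma=20.0, seed=0):
    """Simplified trajectory-level update: w <- w + sum_k P_k * eps_k."""
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        eps = rng.normal(0.0, sigma, size=(n_rollouts, w.size))
        costs = [cost_fn(rollout(w + e, y0, g)) for e in eps]
        w = w + path_probabilities(costs) @ eps
    return w

# Synthetic minimum-jerk "demonstration" from 0 to 1 (stands in for real data).
s = np.arange(STEPS) * DT / TAU
demo = 10 * s ** 3 - 15 * s ** 4 + 6 * s ** 5

w0, y0, g = fit(demo)                      # initial solution from the demonstration
repro = rollout(w0, y0, g)
w1 = improve(w0, y0, g, release_cost)      # iterative cost-weighted refinement

print("max imitation error:", np.max(np.abs(repro - demo)))
print("cost before:", release_cost(repro))
print("cost after: ", release_cost(rollout(w1, y0, g)))
```

In the full path integral method, the cost-to-go is evaluated at every time step and the exploration noise is projected through the basis activations. The sketch collapses this to one scalar cost per rollout so that the probability weighting stays visible.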
Keywords/Search Tags:Learning by demonstration, Reinforcement learning, Trajectory learning and optimizing, Releasing manipulation