
Motion Imitation Learning And Execution For Robot Manipulators

Posted on: 2019-05-02 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: J Hu | Full Text: PDF
GTID: 1318330545985710 | Subject: Control Science and Control Engineering
Abstract/Summary:
Motion planning is one of the key technologies of robot systems. The expertise it requires sets a high barrier for robot users and limits the large-scale application of robots. Learning from demonstrations provides a simple and intuitive approach to motion planning and greatly reduces the cost of training and maintaining robot systems. However, deploying learning-from-demonstration methods on robot systems requires solving several problems, including the human-robot correspondence problem, motion representation, modeling, and new-trajectory generation during the learning stage, and motion control under unknown external forces during the execution stage. In this context, this paper first analyzes the problems of the learning stage and proposes complete frameworks and detailed methods for policy learning and reward learning. To deal with the influence of unknown external forces exerted on the manipulator, this paper then conducts an in-depth study of external force estimation and proposes solutions for robot dynamics modeling and for the design of the disturbance observer and controller. The main contributions are as follows.

(1) A policy learning method for anthropomorphic manipulators is proposed. To solve the human-robot correspondence problem, a modified affine-deformation-based method is proposed to deform the demonstrated trajectory, which improves the position tracking precision of the end effector and the elbow while maintaining the affine invariance of human motion. After the trajectory deformation, a probabilistic learning method is proposed to model the trajectories of the end effector and the elbow, which can generate human-like motion and capture the uncertainty across different demonstrations. A method combining dynamic movement primitives with the learned probabilistic model is then used to generate new trajectories. To follow a newly generated trajectory, a sequential quadratic programming (SQP) based method is used to generate the corresponding joint trajectory, and a Fréchet-distance-based method is employed to initialize the SQP solver. The effectiveness of the proposed methods is verified by learning a racket-swing movement on a humanoid robot system. The results show that the proposed method can generate human-like motion and produce joint trajectories with a higher success rate and in less time.
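For intuition, the following is a minimal single-degree-of-freedom sketch of the dynamic-movement-primitive part of this pipeline, assuming a standard discrete DMP with Gaussian basis functions and least-squares weight fitting. The class name, gain values, and basis-width heuristic are illustrative choices rather than the thesis's formulation, and the coupling with the learned probabilistic model and the SQP-based joint-trajectory generation is omitted.

```python
import numpy as np

class DiscreteDMP:
    """Minimal 1-D discrete dynamic movement primitive (illustrative sketch)."""

    def __init__(self, n_basis=20, alpha=25.0, beta=6.25, alpha_x=3.0):
        self.n_basis, self.alpha, self.beta, self.alpha_x = n_basis, alpha, beta, alpha_x
        # Basis-function centres spread over the phase variable x in (0, 1]
        self.c = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
        self.h = n_basis ** 1.5 / self.c          # widths: common heuristic
        self.w = np.zeros(n_basis)

    def _forcing(self, x):
        # Weighted sum of Gaussian basis functions, gated by the phase variable
        psi = np.exp(-self.h * (x - self.c) ** 2)
        return x * (psi @ self.w) / (psi.sum() + 1e-10)

    def fit(self, y_demo, dt):
        """Fit forcing-term weights to one demonstrated trajectory by least squares."""
        tau = (len(y_demo) - 1) * dt
        y0, g = y_demo[0], y_demo[-1]
        yd = np.gradient(y_demo, dt)
        ydd = np.gradient(yd, dt)
        t = np.arange(len(y_demo)) * dt
        x = np.exp(-self.alpha_x * t / tau)       # canonical system, closed form
        # Forcing term that would reproduce the demonstration exactly
        f_target = tau ** 2 * ydd - self.alpha * (self.beta * (g - y_demo) - tau * yd)
        psi = np.exp(-self.h * (x[:, None] - self.c) ** 2)               # (T, n_basis)
        design = psi * x[:, None] / (psi.sum(axis=1, keepdims=True) + 1e-10)
        self.w, *_ = np.linalg.lstsq(design, f_target, rcond=None)
        self.y0, self.g, self.tau = y0, g, tau

    def rollout(self, dt, g=None):
        """Integrate the DMP; call fit() first. A new goal g reuses the learned shape."""
        g = self.g if g is None else g
        y, yd, x = self.y0, 0.0, 1.0
        traj = []
        for _ in range(int(self.tau / dt)):
            ydd = (self.alpha * (self.beta * (g - y) - self.tau * yd)
                   + self._forcing(x)) / self.tau ** 2
            yd += ydd * dt
            y += yd * dt
            x += (-self.alpha_x * x / self.tau) * dt
            traj.append(y)
        return np.array(traj)
```

A primitive fitted to one demonstration can then be rolled out toward a new goal, which is the adaptation property the combined DMP and probabilistic model exploit before the joint trajectory is computed.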
(2) A multi-stage reward learning method is proposed. Because the demonstrations are strongly inconsistent with optimality, a simultaneous segmentation and reward learning method is developed. In this method, a sampling-based inverse reinforcement learning algorithm is applied over a sliding window on the demonstrations to extract features of the windowed trajectories. The difference between the features of adjacent windows is then used to obtain initial cutting points, and the optimal cutting points are searched in the neighborhood of the initial ones with a dynamic-programming-based method. The advantage of the proposed method is that the learned reward functions can be applied in continuous domains without restricting the possible reward functions to a limited number or to simple forms. To generate new trajectories, a complete objective functional is constructed from the learned multi-stage reward functions together with additional constraints such as obstacle avoidance, and is optimized with a functional gradient method. The effectiveness of the proposed methods is verified through a manipulation task in simulation and a water-transporting task on a real manipulator. The results show that the proposed method obtains more accurate segmentation results and that the learned reward functions encode the specific properties of the demonstrated trajectories. Furthermore, the generated trajectories can adapt to new environments while maintaining the specific properties of each sub-stage of the demonstrations.

(3) A method for estimating the unknown external force exerted on a robot manipulator is proposed. The force estimation method consists of two steps. The first step is to identify a robot dynamics model: a parametric model is derived from rigid-body dynamics (RBD) theory, and, to improve accuracy, a nonparametric compensator trained with a multilayer perceptron is added to compensate for the errors of the RBD model. The result is a semi-parametric model that provides better accuracy while retaining explicit physical meaning. The second step is to construct a force estimation observer. A novel estimation method called the disturbance Kalman filter (DKF) is developed, which takes both the manipulator's dynamics model and the disturbance's dynamics model into account. The process model of the DKF is constructed as a linear time-invariant system, which avoids computing the system Jacobian. To improve safety during trajectory execution, DKF-based collision detection and reaction methods are also proposed. Simulation and experimental results show that the proposed methods improve the accuracy of the identified dynamics model, provide robust and accurate estimation under uncertainty, detect collisions effectively, and generate safe responses quickly.
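To make the observer idea concrete, below is a minimal sketch in the spirit of the disturbance Kalman filter described above: the unknown disturbance is appended to the state of a linear time-invariant process model and recovered by the standard predict/update recursion, so no system Jacobian is needed. The state layout, the random-walk disturbance model, and all noise covariances here are assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

class DisturbanceKalmanFilter:
    """Kalman filter with an augmented, slowly varying disturbance state (sketch).

    State z = [x, d]: x is an n-dimensional quantity predicted by the nominal
    (rigid-body plus learned compensator) dynamics model, and d is the unknown
    external disturbance driving it, modeled as a random walk.
    """

    def __init__(self, n, dt, q_x=1e-4, q_d=1e-2, r=1e-3):
        self.n, self.dt = n, dt
        # Linear time-invariant process model for z = [x, d]:
        #   x_{k+1} = x_k + dt * (u_k + d_k),   d_{k+1} = d_k
        self.F = np.block([[np.eye(n), dt * np.eye(n)],
                           [np.zeros((n, n)), np.eye(n)]])
        self.B = np.vstack([dt * np.eye(n), np.zeros((n, n))])
        self.H = np.hstack([np.eye(n), np.zeros((n, n))])   # only x is measured
        self.Q = np.block([[q_x * np.eye(n), np.zeros((n, n))],
                           [np.zeros((n, n)), q_d * np.eye(n)]])
        self.R = r * np.eye(n)
        self.z = np.zeros(2 * n)
        self.P = np.eye(2 * n)

    def step(self, u, x_meas):
        """One predict/update cycle: u is the model-based input, x_meas the measurement."""
        # Predict with the LTI process model
        self.z = self.F @ self.z + self.B @ u
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the measurement of x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.z = self.z + K @ (x_meas - self.H @ self.z)
        self.P = (np.eye(2 * self.n) - K @ self.H) @ self.P
        return self.z[self.n:]   # current estimate of the external disturbance
```

A simple collision-detection rule in the spirit of the DKF-based detection mentioned above would be to flag a collision whenever the norm of the returned disturbance estimate exceeds a threshold and then trigger a safe reaction.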
Keywords/Search Tags: Robot motion planning, Robot motion control, Learning from demonstrations, Inverse reinforcement learning, External force estimation