
Study On Humanoid Action Imitation Of Robot Based On Deep Reinforcement Learning

Posted on: 2023-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: S Huang
Full Text: PDF
GTID: 2558307100975909
Subject: Control Science and Engineering
Abstract/Summary:
Humanoid action learning is a core problem in robotics. By performing humanoid actions, robots can directly replace humans in complex tasks without adapting to new environments; humanoid action learning is thus the basis of robots' broad application. At present, the main methods for learning action policies include deep reinforcement learning and imitation learning. Although these methods have been applied successfully in simulation, many problems remain in real environments, such as the need to interact with the environment, inaccurate imitation, and poor robustness. Aiming at humanoid action learning, and starting from the problems that may arise when standard reinforcement learning is applied in the real world, this thesis studies three methods:

1. Offline Reinforcement Learning Based on Anderson Acceleration for Robotic Tasks

Traditional reinforcement learning requires the robot to interact with an environment to learn a policy. In practice, however, robots are constrained by their physical embodiment and cannot perform many physical interactions. In addition, low data quality, slow convergence, and low sample efficiency limit application in real-world environments. To address these issues, this thesis proposes an offline reinforcement learning method with Anderson acceleration. The method learns effectively from previously collected demonstrations through batch constraints and conservative learning, and then introduces an Anderson acceleration mechanism to speed up policy learning and achieve better performance.

2. Robust Imitation Learning from Human Behavior in the Presence of State Perturbations

Traditional reinforcement learning selects actions based on the observed state of the environment, which often contains measurement errors or adversarial noise. If the observed state deviates from the true state, the robot may be misled into performing unexpected actions, resulting in unacceptable losses. In addition, it is difficult for the robot to imitate a reference motion accurately because of the dynamics-model mismatch problem. To address these issues, this thesis proposes a robust humanoid action learning method with a residual force control mechanism. The method suppresses state disturbances by introducing regularization terms into the robot policy and resolves the dynamics mismatch by introducing residual force control. As a result, it can mimic humanoid actions accurately.

3. Generative Adversarial Imitation Learning from Human Behavior with Reward Shaping

Although traditional reinforcement learning can teach a robot simple action policies, it still suffers from low sample efficiency and from the difficulty of manually defining a reward function for tasks such as humanoid behavior imitation. To address these issues, this thesis proposes a generative adversarial imitation learning method that learns from human behavior. The method uses a proximal policy optimization algorithm as its generator and improves performance by introducing a reward shaping mechanism. As a result, it can mimic humanoid actions accurately.

From a practical perspective, this work addresses several problems in imitating and learning humanoid actions. It aims to make robots learn expected humanoid actions, which is significant for advancing the intelligence of robots.
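To give a sense of the acceleration mechanism in the first method: Anderson acceleration speeds up a fixed-point iteration x_{k+1} = g(x_k) by combining several past iterates with least-squares weights. The sketch below applies it to policy evaluation on a toy random MDP; the MDP, the memory size of 4, and all variable names are illustrative assumptions, not the thesis's actual algorithm or tasks.

```python
import numpy as np

def anderson_step(X, GX):
    """One Anderson step. X and GX are (m, n) arrays holding the m most
    recent iterates x_i and their images g(x_i). Solves
    min_a ||sum_i a_i (g(x_i) - x_i)||  s.t.  sum_i a_i = 1,
    then returns x_next = sum_i a_i g(x_i)."""
    F = GX - X                                  # residuals f_i = g(x_i) - x_i
    m = F.shape[0]
    ones = np.ones(m)
    G = F @ F.T + 1e-10 * np.eye(m)             # regularized Gram matrix
    # KKT system for the equality-constrained least-squares problem.
    A = np.block([[G, ones[:, None]], [ones[None, :], np.zeros((1, 1))]])
    b = np.concatenate([np.zeros(m), [1.0]])
    a = np.linalg.solve(A, b)[:m]
    return a @ GX

# Toy policy-evaluation problem: g(v) = r + gamma * P @ v is a contraction
# whose fixed point is the value function v*.
rng = np.random.default_rng(0)
n, gamma = 20, 0.95
P = rng.random((n, n)); P /= P.sum(axis=1, keepdims=True)
r = rng.random(n)
g = lambda v: r + gamma * P @ v

v = np.zeros(n)
hist = [v]
for _ in range(50):
    m = min(len(hist), 4)                       # keep a memory of 4 iterates
    X = np.stack(hist[-m:])
    GX = np.stack([g(x) for x in X])
    v = anderson_step(X, GX)
    hist.append(v)

v_star = np.linalg.solve(np.eye(n) - gamma * P, r)
print(np.max(np.abs(v - v_star)))               # residual after 50 steps
```

Plain value iteration contracts only at rate gamma per step; the extrapolation over past residuals is what gives Anderson acceleration its faster convergence on problems like this.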
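The regularization idea in the second method can be sketched as a penalty on how much the policy's action changes under small observation noise, so that measurement errors do not flip the chosen action. The linear policy, the noise range eps, and the sample count below are illustrative assumptions; the thesis's actual regularization terms and the residual force control mechanism are not reproduced here.

```python
import numpy as np

def policy(theta, s):
    """Toy deterministic linear policy a = theta @ s (stand-in for a
    neural network policy)."""
    return theta @ s

def smoothness_penalty(theta, s, eps=0.05, n_samples=8, seed=0):
    """Penalize action sensitivity to bounded state perturbations:
    E_delta ||pi(s + delta) - pi(s)||^2 with delta ~ U(-eps, eps)^n.
    Adding this term to the policy loss encourages robustness to
    measurement errors and adversarial observation noise."""
    rng = np.random.default_rng(seed)
    a = policy(theta, s)
    total = 0.0
    for _ in range(n_samples):
        delta = rng.uniform(-eps, eps, size=s.shape)
        total += np.sum((policy(theta, s + delta) - a) ** 2)
    return total / n_samples

theta = np.array([[1.0, -2.0], [0.5, 3.0]])
s = np.array([0.2, -0.1])
print(smoothness_penalty(theta, s))             # non-negative penalty
```

A policy that ignores its input entirely incurs zero penalty, so in training this term is weighted against the imitation objective rather than minimized on its own.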
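For the third method, the two ingredients can be sketched as follows: a GAIL-style imitation reward derived from the discriminator's score, plus a potential-based shaping term, which is one standard reward reshaping scheme that leaves the optimal policy unchanged. The sigmoid-logit form of the discriminator output and the pose-distance potential are assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

def gail_reward(d_logits):
    """Imitation reward from discriminator logits: r = -log(1 - D(s, a)),
    where D = sigmoid(logits) scores a state-action pair as expert-like.
    Higher logits (more expert-like) yield higher reward."""
    d = 1.0 / (1.0 + np.exp(-d_logits))
    return -np.log(1.0 - d + 1e-8)

def shaped_reward(r_imit, phi_s, phi_s_next, gamma=0.99):
    """Potential-based shaping: r' = r + gamma * phi(s') - phi(s).
    This form preserves the optimal policy while giving the PPO
    generator denser feedback."""
    return r_imit + gamma * phi_s_next - phi_s

# Hypothetical potential: negative distance from the robot's pose to the
# reference motion frame (an assumed choice for illustration).
logits = np.array([2.0, -1.0, 0.5])
phi = np.array([-0.8, -0.5, -0.3])              # phi(s_t)
phi_next = np.array([-0.5, -0.3, -0.1])         # phi(s_{t+1})
print(shaped_reward(gail_reward(logits), phi, phi_next))
```

In the full method this reward replaces a hand-designed task reward: the PPO generator maximizes it while the discriminator is trained to separate generated trajectories from human demonstrations.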
Keywords/Search Tags: robot learning, adversarial imitation learning, deep reinforcement learning, demonstration data, humanoid action imitation