
Walking Control System Of 2D Biped Robot Using Reinforcement Learning

Posted on: 2019-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: J Leng
Full Text: PDF
GTID: 2428330590465819
Subject: Control Science and Engineering

Abstract/Summary:
Walking control of 2D biped robots is one of the key fundamental problems in robotics research. Because real biped robots have complex structures, it is generally difficult to establish accurate dynamic models for them, so traditional model-based analysis methods struggle to achieve good control performance. With the rise of intelligent algorithms in recent years, researchers have gradually introduced reinforcement learning into the walking control of biped robots. However, current reinforcement-learning-based walking control systems either cannot be fully decoupled from the robot's dynamic model and still require a reference gait, or can only handle discrete state and action spaces, which prevents accurate control. This paper therefore proposes a 2D biped dynamic walking learning method based on the mean-asynchronous advantage actor-critic (M-A3C) algorithm, which handles continuous state and action spaces directly and requires no reference gait.

Based on an analysis of the dynamic walking process of a 2D biped robot, this paper improves the original simplest walking model and proposes a simplest walking model with a retractable knee joint, driven by impulsive push-off and hip torque, and obtains a 1-cycle gait for this model. On the basis of this abstract walking model, a 2D biped robot with a retractable knee joint and its virtual counterpart on a physics simulation platform are designed to test the proposed dynamic walking learning method.

The core of the dynamic walking learning method is the M-A3C algorithm, an improved variant of the asynchronous advantage actor-critic (A3C) algorithm. The method works as follows: neural networks trained with the M-A3C algorithm take robot state vectors as inputs and output joint-drive action vectors; these networks are first trained with virtual robots on the physics simulation platform; after training, they are transferred to control the walking of a robot in the real physical environment. Training with multiple virtual robots that share the physical parameters of the real robot reduces training cost and increases training speed.

Finally, a 2D biped robot walking control system is realized by combining the 2D biped robot with a retractable knee joint and the dynamic walking learning method. This control system completes the walking control of the actual robot and produces a 2-cycle gait. In addition, two M-A3C implementations, one using a neural network with long short-term memory (LSTM) cells and one using a plain fully connected network, are compared through four groups of experiments, verifying the feasibility of the dynamic walking learning method and the superiority of the networks with LSTM cells.
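The advantage actor-critic update at the heart of A3C-style methods such as M-A3C can be illustrated with a minimal sketch. The toy environment, the linear actor and critic, and all parameter names below are illustrative assumptions for exposition only; they are not the thesis's actual networks, robot model, or M-A3C averaging scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear actor-critic on a 1-D state, showing the advantage
# actor-critic update that A3C-style methods apply (asynchronously,
# across many workers, in the full algorithm).
w_v = 0.0          # critic weight:  V(s)  = w_v  * s
w_mu = 0.0         # actor weight:   mu(s) = w_mu * s
sigma = 0.5        # fixed std of the Gaussian policy (exploration)
gamma, alpha = 0.99, 0.01

def step(s, a):
    """Hypothetical environment: reward is high when the action tracks the state."""
    r = -(a - s) ** 2
    s_next = float(np.clip(s + 0.1 * a, -1.0, 1.0))
    return r, s_next

s = 0.5
for _ in range(5000):
    mu = w_mu * s
    a = mu + sigma * rng.standard_normal()       # sample action from pi(a|s)
    r, s_next = step(s, a)

    # One-step advantage estimate (TD error): A = r + gamma*V(s') - V(s)
    advantage = r + gamma * (w_v * s_next) - w_v * s

    # Critic: semi-gradient TD update on the value weights
    w_v += alpha * advantage * s

    # Actor: policy gradient, grad log pi(a|s) = (a - mu) / sigma**2 * s
    w_mu += alpha * advantage * (a - mu) / sigma**2 * s

    s = s_next

print(round(w_mu, 2))  # the policy mean should roughly track the state (w_mu near 1)
```

In the full M-A3C setting, each worker (here, each virtual robot in the simulator) would run this loop on its own copy of the environment and contribute gradients to shared network weights, which is what makes parallel training with multiple virtual robots reduce wall-clock training time.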
Keywords/Search Tags:2D biped robot, walking control, reinforcement learning