
Bipedal Periodic Walking Control Research Based On Reinforcement Learning

Posted on: 2020-03-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Wu
GTID: 1368330590953692
Subject: Mechanical and electrical engineering
Abstract/Summary:
Biped robots have attracted research attention because of their anthropomorphic shape and good walking ability in complex environments, and because they can operate in human-centered surroundings without those surroundings having to be changed. Periodic walking, a kind of rhythmic movement, is one of the basic locomotion types of a biped robot. However, existing periodic walking controllers intended to help biped robots adapt to different environments have drawbacks: they are model-based, they assume a rigid surface and so ignore the effect of the surface material on locomotion, and they lack the ability to learn dynamically. These limitations hinder the development and application of biped robots. Reinforcement learning (RL) describes the learning process of an agent through its interaction with the environment. RL offers benefits such as model-free control, adaptation to different environments, and a dynamic learning process. For periodic locomotion tasks, this dissertation proposes walking controllers based on reinforcement learning, under which human-like, adaptive periodic biped gaits can be obtained. The main contributions of the dissertation are summarized as follows.

First, a periodic walking controller that enhances the adaptivity of passive dynamic walking (PDW) is proposed. Under the rigid-ground assumption, a hybrid dynamics model of the biped is developed to serve as the locomotion simulator for training. The initial value of passive dynamic walking is computed with the cell mapping method to obtain a periodic gait, and the stability of this gait is analyzed. A walking controller based on a deep Q-network is then designed. During the trial and error of reinforcement learning, the stable PDW gait is used as a training reference, which helps to preserve the natural gait of PDW and reduces the necessary number of episodes to 50. After training, the controller can be applied to the planar biped robot. Results show that successful periodic locomotion is achieved from a disturbed initial value, on level ground, on different slopes, and on terrain with varying slope. For instance, starting from a disturbed initial value, the controller adjusts the gait and reaches planar periodic locomotion with a steady velocity of 0.7363 m/s.

Second, a periodic walking controller for a biped robot on compliant ground is proposed. Under the compliant-ground assumption, a robot-ground coupled dynamics model is developed as the locomotion simulator, giving more accurate results than the rigid-ground assumption. The periodic PDW gait on compliant ground is derived, and the effects of ground compliance and hip stiffness are investigated to show that the robot can adapt to varying ground compliance by modulating its body stiffness parameter. A periodic walking controller based on reinforcement learning is then designed. By controlling the hip stiffness parameter of the flexible biped robot, the controller enhances the adaptivity of PDW on compliant ground. Results show that using the PDW gait as the reference trajectory for RL reduces the number of episodes to 35. In addition, periodic planar locomotion is achieved under the proposed controller from a disturbed initial value, on level ground, on compliant slopes of different inclination, and on ground with varying compliance. For instance, the controller achieves planar periodic locomotion with a steady velocity of 0.6845 m/s on level compliant ground.

Third, a walking controller is proposed to enhance gait adaptivity for a multi-joint biped robot. For the 3D locomotion task of the multi-joint biped robot NAO, a walking controller based on reinforcement learning and inspired by rhythmic movement is designed to implement controlled walking in complex environments. Gait planning is based on a central pattern generator (CPG): a basic oscillator produces a rhythmic signal, and the CPG network maps this signal into joint space. The parameters of the CPG network are optimized with particle swarm optimization (PSO) to implement straight walking on level ground as the baseline motion. A model-free feedback pathway, imitating the reflex mechanisms of animals, is designed via reinforcement learning, and the trained RL network is then used as the walking controller. Results from the V-REP dynamics simulator and from a physical prototype both show successful 3D periodic locomotion on level ground and on upward slopes.

The walking controllers proposed in this dissertation, inspired by the gait features and learning process of animals, can improve a biped robot's ability to learn to walk. Because these model-free controllers do not rely on specific environment models, and thanks to the generalization ability of reinforcement learning, they can enhance the versatility of the biped robot in adapting to complex and varying environments.
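The CPG gait-planning step in the third contribution (a basic oscillator produces a rhythmic signal, which is then mapped into joint space) can be illustrated with a minimal sketch. The abstract does not name the oscillator model or the mapping, so everything below is an assumption made for illustration: a Hopf oscillator is used as the basic oscillator, the mapping is a simple linear one, and the joint names and gain values are hypothetical. In the dissertation such parameters are optimized with PSO; here they are fixed by hand.

```python
import math

def hopf_step(x, y, mu=1.0, omega=2.0 * math.pi, dt=0.001):
    """One explicit-Euler step of a Hopf oscillator.

    Assumed model (not taken from the abstract): the state converges to a
    limit cycle of radius close to sqrt(mu), rotating at omega rad/s.
    """
    r2 = x * x + y * y
    dx = (mu - r2) * x - omega * y
    dy = (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

# Hypothetical mapping parameters: joint -> (offset, gain on x, gain on y).
# Opposite signs on left and right joints give an antiphase, walking-like pattern.
GAINS = {
    "left_hip_pitch": (0.0, 0.3, 0.0),
    "right_hip_pitch": (0.0, -0.3, 0.0),
    "left_knee_pitch": (0.4, 0.0, 0.2),
    "right_knee_pitch": (0.4, 0.0, -0.2),
}

def joint_targets(x, y, gains=GAINS):
    """Linearly map the oscillator state (x, y) to joint angle targets in radians."""
    return {name: off + a * x + b * y for name, (off, a, b) in gains.items()}

def simulate(seconds=5.0, dt=0.001):
    """Integrate the oscillator and record the joint targets at every step."""
    x, y = 0.1, 0.0  # small nonzero start so the state grows onto the limit cycle
    trajectory = []
    for _ in range(round(seconds / dt)):
        x, y = hopf_step(x, y, dt=dt)
        trajectory.append(joint_targets(x, y))
    return x, y, trajectory

if __name__ == "__main__":
    x, y, traj = simulate()
    print("final oscillator radius:", math.hypot(x, y))
    print("last joint targets:", traj[-1])
```

A feedback controller such as the RL pathway described above would then add corrections on top of these open-loop joint targets; that part is not sketched here.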
Keywords/Search Tags:Biped robot, Reinforcement learning, Periodic walking control, Passive dynamic walking, Central pattern generator