
Walking Control Research Based On Q-Learning For Underactuated Biped Robot

Posted on: 2014-01-16    Degree: Master    Type: Thesis
Country: China    Candidate: D Y Liu    Full Text: PDF
GTID: 2248330395992875    Subject: Control theory and control engineering
Abstract/Summary:
Passive Dynamic Walking (PDW) is an important branch of biped walking research. Its purpose is to discover the essential characteristics of biped walking and to take full advantage of the robot's own dynamics during walking. Because robot structures differ widely, so do their dynamics, and tracking another robot's gait is therefore inadvisable. Q-learning can exploit the biped's own dynamics through continuous interaction between the robot and the floor: the optimal policy is learned through a series of trials and errors. In this paper, a Q-learning method based on a neural network is adopted to control a five-link underactuated walking robot, and stable, continuous dynamic walking is achieved.

The main work of this paper is as follows:

Firstly, a planar five-link, four-drive model is used, and a compliant actuator is selected.

Secondly, a Q-learning control method based on an RBF neural network is designed. The RBF network is employed to compute the discrete Q-values of the continuous state, and an eligibility trace is integrated into the network to handle the temporal credit-assignment problem. To avoid dimension explosion, an equivalent pose-energy inverted-pendulum model is derived to reduce the network's input dimension. Meanwhile, a new ε-greedy strategy is proposed to balance "exploration" and "exploitation" in Q-learning, tending toward the greedy policy as the number of steps increases. Simulation results indicate that the proposed method is effective.

Thirdly, to improve the learning efficiency of Q-learning, the idea of Experience Replay is introduced. Simulation results indicate that Experience Replay improves the efficiency of Q-learning.

Fourthly, an ADAMS simulation platform is designed.
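To make the ingredients above concrete, the following is a minimal illustrative sketch, not the thesis's actual controller: Q-learning for a continuous scalar state, with Q-values approximated by a linear combination of Gaussian RBF features, a decaying ε-greedy policy that grows greedier as steps increase, and a small experience-replay buffer. All names, constants, and the toy one-dimensional state are assumptions; the eligibility trace and the pose-energy state reduction described above are omitted for brevity.

```python
import math
import random

# --- assumed toy setup (not from the thesis) ---
N_ACTIONS = 3                              # discrete action set
CENTERS = [i / 4.0 for i in range(5)]      # RBF centers on [0, 1]
SIGMA = 0.25                               # RBF width
ALPHA, GAMMA = 0.1, 0.95                   # learning rate, discount factor

def rbf_features(s):
    """Gaussian RBF activations for a scalar state s."""
    return [math.exp(-((s - c) ** 2) / (2 * SIGMA ** 2)) for c in CENTERS]

# one linear weight vector per discrete action
weights = [[0.0] * len(CENTERS) for _ in range(N_ACTIONS)]

def q_value(s, a):
    """Q(s, a) as a linear combination of RBF features."""
    return sum(w * f for w, f in zip(weights[a], rbf_features(s)))

def epsilon(step, eps0=1.0, decay=0.01):
    """Decaying exploration rate: the policy grows greedier with step count."""
    return eps0 / (1.0 + decay * step)

def select_action(s, step):
    """Decaying epsilon-greedy action selection."""
    if random.random() < epsilon(step):
        return random.randrange(N_ACTIONS)                          # explore
    return max(range(N_ACTIONS), key=lambda a: q_value(s, a))       # exploit

def td_update(s, a, r, s_next):
    """One gradient step on the TD error of the linear-RBF Q function."""
    target = r + GAMMA * max(q_value(s_next, b) for b in range(N_ACTIONS))
    delta = target - q_value(s, a)
    phi = rbf_features(s)
    for i in range(len(CENTERS)):
        weights[a][i] += ALPHA * delta * phi[i]

replay = []  # experience-replay buffer of (s, a, r, s_next) transitions

def store_and_replay(transition, batch=8):
    """Store a transition, then relearn from a random minibatch of past ones."""
    replay.append(transition)
    for s, a, r, s_next in random.sample(replay, min(batch, len(replay))):
        td_update(s, a, r, s_next)
```

Replaying stored transitions lets each interaction with the environment be reused for several updates, which is the efficiency gain the Experience Replay step above refers to.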
An ADAMS virtual prototype is established to obtain more practical and more effective simulation results. Meanwhile, an ADAMS and MATLAB co-simulation platform is built; results show that the platform simplifies operation and improves efficiency.
Keywords/Search Tags:Biped locomotion, Q-learning, RBF Neural Networks, Experience replay, ADAMS simulation