
Walking Control Research Based On Q-Learning For Underactuated Biped Robot

Posted on: 2014-01-16    Degree: Master    Type: Thesis
Country: China    Candidate: D Y Liu    Full Text: PDF
GTID: 2248330395992875    Subject: Control theory and control engineering
Abstract/Summary:
Passive Dynamic Walking (PDW) is an important branch of biped walking research. Its purpose is to discover the essential characteristics of biped walking and to take full advantage of the robot's own dynamics during walking. Because robot structures differ widely, so do their dynamics, and tracking another robot's gait is therefore inadvisable. Q-learning can exploit the biped's own dynamics through continuous interaction between the robot and the floor: the optimal policy is learned through a series of trials and errors. In this paper, a Q-learning method based on a neural network is adopted to control a five-link underactuated walking robot, and stable, continuous dynamic walking is achieved.

The main work of this paper is as follows:

Firstly, a planar five-link, four-drive model is used, and a compliant actuator is selected.

Secondly, a Q-learning control method based on an RBF neural network is designed. The RBF network is employed to compute the discrete Q-values of the continuous state, and an eligibility trace is integrated into the network to handle the temporal credit-assignment problem. To avoid dimension explosion, an equivalent pose-energy inverted-pendulum model is derived to reduce the network's input dimension. Meanwhile, a new ε-greedy strategy is proposed to balance "exploration" and "exploitation" in Q-learning, tending toward the greedy policy as the number of steps increases. Simulation results indicate that the proposed method is effective.

Thirdly, to improve the learning efficiency of Q-learning, the idea of Experience Replay is introduced. Simulation results indicate that Experience Replay improves the efficiency of Q-learning.

Fourthly, an ADAMS simulation platform is designed.
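To make the ingredients above concrete, the following is a minimal illustrative sketch, not the thesis's actual controller: Q-learning for a continuous scalar state, with Q-values approximated by a linear combination of Gaussian RBF features, a decaying ε-greedy policy that grows greedier as steps increase, and a small experience-replay buffer. All names, constants, and the toy one-dimensional state are assumptions; the eligibility trace and the pose-energy state reduction described above are omitted for brevity.

```python
import math
import random

# --- assumed toy setup (not from the thesis) ---
N_ACTIONS = 3                              # discrete action set
CENTERS = [i / 4.0 for i in range(5)]      # RBF centers on [0, 1]
SIGMA = 0.25                               # RBF width
ALPHA, GAMMA = 0.1, 0.95                   # learning rate, discount factor

def rbf_features(s):
    """Gaussian RBF activations for a scalar state s."""
    return [math.exp(-((s - c) ** 2) / (2 * SIGMA ** 2)) for c in CENTERS]

# one linear weight vector per discrete action
weights = [[0.0] * len(CENTERS) for _ in range(N_ACTIONS)]

def q_value(s, a):
    """Q(s, a) as a linear combination of RBF features."""
    return sum(w * f for w, f in zip(weights[a], rbf_features(s)))

def epsilon(step, eps0=1.0, decay=0.01):
    """Decaying exploration rate: the policy grows greedier with step count."""
    return eps0 / (1.0 + decay * step)

def select_action(s, step):
    """Decaying epsilon-greedy action selection."""
    if random.random() < epsilon(step):
        return random.randrange(N_ACTIONS)                          # explore
    return max(range(N_ACTIONS), key=lambda a: q_value(s, a))       # exploit

def td_update(s, a, r, s_next):
    """One gradient step on the TD error of the linear-RBF Q function."""
    target = r + GAMMA * max(q_value(s_next, b) for b in range(N_ACTIONS))
    delta = target - q_value(s, a)
    phi = rbf_features(s)
    for i in range(len(CENTERS)):
        weights[a][i] += ALPHA * delta * phi[i]

replay = []  # experience-replay buffer of (s, a, r, s_next) transitions

def store_and_replay(transition, batch=8):
    """Store a transition, then relearn from a random minibatch of past ones."""
    replay.append(transition)
    for s, a, r, s_next in random.sample(replay, min(batch, len(replay))):
        td_update(s, a, r, s_next)
```

Replaying stored transitions lets each interaction with the environment be reused for several updates, which is the efficiency gain the Experience Replay step above refers to.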
An ADAMS virtual prototype is established to obtain more practical and more effective simulation results. Meanwhile, an ADAMS and MATLAB co-simulation platform is built; results show that the platform simplifies operation and improves efficiency.
Keywords/Search Tags:Biped locomotion, Q-learning, RBF Neural Networks, Experience replay, ADAMS simulation