Learning Biped Locomotion Based On Q-Learning And Neural Networks

Posted on: 2013-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Peng
Full Text: PDF
GTID: 2218330371957832
Subject: Control theory and control engineering
Abstract/Summary:
Passive Dynamic Walking (PDW) theory treats biped locomotion as a natural property of a biped robot, and exploiting the robot's natural dynamics can improve its energy efficiency. Because robots differ in mechanical structure, their dynamics differ considerably, so it is not advisable to track the gait of another biped robot or of a human. In Q-Learning, the optimal policy is found through repeated trial and error, so a robot can learn biped locomotion through its interaction with the floor. The natural dynamics of the robot can then be exploited to improve the energy efficiency of the gait. To study biped gait control methods in more depth, a dynamic simulation platform and a real biped robot are designed.

1. The robot's posture changes continuously until an impact occurs. To handle learning over this continuous state space, a Q-Learning controller based on BP neural networks is designed. Instead of a Q table, a multi-input, multi-output BP neural network computes the Q values for continuous states. To address the temporal credit assignment problem in Q-Learning, we integrate the eligibility trace algorithm with the gradient descent method for continuous states. To avoid a dimension explosion, an inverted pendulum pose-energy model is built to reduce the dimension of the input state space. To balance exploration and exploitation in Q-Learning, we use a new ε-greedy method with a variable random-action probability that decreases as the number of steps increases. Simulation results indicate that the proposed method is effective.

2. To simplify operation and improve simulation efficiency, a biped robot simulation platform is developed. ADAMS is used to build a parametric model library that includes two-link, three-link, four-link, five-link and seven-link models. Customized menus and graphical user interfaces (GUIs) are then developed for loading models, initialization, modifying parameters and displaying simulation results. The ADAMS/Controls interface module makes co-simulation between ADAMS and MATLAB straightforward. With the co-simulation platform, the heavy work of manual modeling is avoided and simulation efficiency is improved.

3. Based on PDW theory, we design a 2D quasi-PDW biped robot with 8 degrees of freedom (DOF). A latch mechanism on the knee allows the support leg to stay upright. Virtual flexible actuators and DC servo motors are used at the ankle and hip. The control system is a classic hierarchical control system, and a CAN bus is used for fast communication. A GUI is designed for initialization, real-time control, data acquisition, data saving, recovery processing and so on. The biped robot is intended to be a simple, easy-to-use and high-precision platform.
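
The learning controller described in the abstract above combines an ε-greedy rule whose exploration probability shrinks with the step count and a gradient-descent Q update with eligibility traces. The following is a minimal sketch of those two ideas only, not the thesis's implementation. It rests on several assumptions: the exponential decay schedule and all parameter values are illustrative, a linear approximator stands in for the BP neural network to keep the example short, traces are not cut after exploratory actions, and the names `epsilon`, `select_action` and `LinearQ` are hypothetical.

```python
import math
import random


def epsilon(step, eps_start=0.5, eps_min=0.01, decay=0.01):
    """Exploration probability that decreases as the step count grows.

    The exponential form and the default values are assumptions; the
    abstract only says the probability decreases with the step number.
    """
    return eps_min + (eps_start - eps_min) * math.exp(-decay * step)


def select_action(q_values, step, rng=random):
    """epsilon-greedy: explore with probability epsilon(step), else act greedily."""
    if rng.random() < epsilon(step):
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


class LinearQ:
    """Q(x, a) = w[a] . x, a linear stand-in for the BP network (assumption)."""

    def __init__(self, n_features, n_actions, alpha=0.1, gamma=0.95, lam=0.8):
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.w = [[0.0] * n_features for _ in range(n_actions)]  # weights
        self.e = [[0.0] * n_features for _ in range(n_actions)]  # traces

    def q(self, x):
        """Return the list of Q values, one per action, for feature vector x."""
        return [sum(wi * xi for wi, xi in zip(w_a, x)) for w_a in self.w]

    def update(self, x, a, reward, x_next, done):
        """One gradient-descent Q update with accumulating eligibility traces."""
        target = reward if done else reward + self.gamma * max(self.q(x_next))
        delta = target - self.q(x)[a]  # temporal-difference error
        # Decay every trace, then add the gradient of Q(x, a) w.r.t. w[a],
        # which is simply x for a linear approximator.
        for e_b in self.e:
            for i in range(len(e_b)):
                e_b[i] *= self.gamma * self.lam
        for i, xi in enumerate(x):
            self.e[a][i] += xi
        # Credit all recently visited state-action pairs through their traces.
        for w_b, e_b in zip(self.w, self.e):
            for i in range(len(w_b)):
                w_b[i] += self.alpha * delta * e_b[i]
        if done:  # an episode boundary clears the traces
            self.e = [[0.0] * len(e_b) for e_b in self.e]
        return delta
```

In a training loop, one would call `select_action(agent.q(x), step)` to pick an action and `agent.update(...)` after each transition. Replacing `LinearQ` with a network would mean accumulating traces over the network's weight gradients instead of over the feature vector.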
Keywords/Search Tags: Biped locomotion, Q-Learning, Eligibility trace, BP Neural Networks, Simulation platform, Quasi-PDW biped robot