
Reinforcement Learning-based Optimal Control Methods With Applications To Mobile Robots

Posted on: 2015-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Yang
Full Text: PDF
GTID: 2348330509460660
Subject: Control Science and Engineering
Abstract/Summary:
Unlike supervised and unsupervised learning, reinforcement learning is a class of machine learning methods that obtains a reinforcement signal by interacting with the environment and uses a value function, or an estimate of the policy, to optimize sequential decision processes. To overcome the "curse of dimensionality" in problems with large state and action spaces, reinforcement learning with value function approximation is commonly used to solve large-scale optimal control problems. Because reinforcement learning depends only weakly on an accurate dynamic model and can optimize controllers from experience, it also holds great promise for path tracking control of mobile robots. Supported by the Natural Science Foundation of China, this thesis studies reinforcement learning based on value function approximation and manifold methods, and combines reinforcement learning with classical control to achieve high-precision path tracking for mobile robots. The main contributions of this thesis are as follows:

1. Building on linear temporal difference learning with gradient correction (TDC), two improved optimal control methods, an improved Q-learning algorithm and an improved HDP algorithm, are proposed to extend TDC from learning prediction to learning control. Because TDC is a true stochastic gradient descent method, convergence of the improved Q-learning method can be guaranteed under off-policy training. Experiments on the mountain-car and inverted pendulum systems validate the efficiency of the proposed methods, and performance under different learning-rate settings is also tested and analyzed. (A minimal sketch of the TDC update appears after this list.)

2. To overcome the difficulty of choosing basis functions for approximators, a novel automatic basis function generation method is proposed and used to construct the critic network of the DHP algorithm, yielding a framework of Dual Heuristic Programming based on Geodesic Laplacian Eigenmaps (GLEM-DHP). Comparisons with other DHP methods on nonlinear dynamic systems show the superior performance of the proposed method in both simulation and physical experiments. (See the basis-generation sketch after this list.)

3. A better way of selecting PID parameters is investigated. Exploiting the learning ability of DHP, a PID control algorithm with self-learning parameters (DHP-PID) is proposed for mobile-robot path tracking: the PID gains are generated and adjusted according to the reference path and the system state so as to reduce the total tracking error. DHP-PID is tested on three different kinds of reference paths and outperforms a fixed-gain PID controller on all of them; the controller learned by DHP-PID is also validated on a Pioneer3-AT wheeled mobile robot in the MobileSim platform, with satisfactory simulation results. (A self-tuning PID sketch follows this list.)

4. Successful application to the Googol single-stage linear inverted pendulum system demonstrates the feasibility and efficiency of the GLEM-DHP method and lays a foundation for practical engineering applications of reinforcement learning in the real world.
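For reference, below is a minimal sketch of the linear TDC update that contribution 1 builds on, in the standard two-timescale form for linear value function approximation. The function name and parameters are illustrative, not the thesis code; the control extensions (improved Q-learning and improved HDP) apply an analogous correction to action-value weights.

```python
import numpy as np

def tdc_update(theta, w, phi, phi_next, reward, gamma, alpha, beta):
    """One TDC (TD with gradient correction) step for linear value
    function approximation V(s) ~= theta . phi(s).

    theta : main weight vector
    w     : auxiliary weight vector estimating the expected TD error
    """
    delta = reward + gamma * theta @ phi_next - theta @ phi  # TD error
    # Corrected gradient step: the second term removes the bias that
    # makes plain off-policy TD(0) diverge under function approximation.
    theta = theta + alpha * (delta * phi - gamma * (w @ phi) * phi_next)
    # Auxiliary weights track the projection of the TD error onto phi.
    w = w + beta * (delta - w @ phi) * phi
    return theta, w
```

Because this update follows a true stochastic gradient, the convergence guarantee mentioned in contribution 1 carries over to off-policy sampling, which plain Q-learning with function approximation does not enjoy.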
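Contribution 2's automatic basis generation could look roughly like the following sketch: a k-nearest-neighbour graph over sampled states, heat-kernel weights on approximate geodesic distances, and the smoothest eigenvectors of the graph Laplacian taken as critic basis functions. The function name, neighbour count, and kernel width are assumptions for illustration; the exact GLEM-DHP construction in the thesis may differ.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path  # graph geodesic distances
from scipy.linalg import eigh

def laplacian_eigenmap_basis(states, k_neighbors=10, n_basis=20, sigma=1.0):
    """Hypothetical sketch in the spirit of GLEM-DHP basis generation.
    Assumes the sampled states form a single connected neighbourhood graph."""
    x = np.asarray(states, dtype=float)
    n = len(x)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    # Keep only the k nearest neighbours to capture the manifold structure.
    adj = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d[i])[1:k_neighbors + 1]  # skip self (distance 0)
        adj[i, idx] = d[i, idx]
    adj = np.maximum(adj, adj.T)               # symmetrise the graph
    geo = shortest_path(adj, directed=False)    # approximate geodesics
    w = np.exp(-geo**2 / (2 * sigma**2))        # heat-kernel edge weights
    lap = np.diag(w.sum(axis=1)) - w            # combinatorial Laplacian
    _, vecs = eigh(lap)                         # eigenvectors, ascending
    return vecs[:, :n_basis]                    # smoothest ones as basis
```

The returned columns serve as state features for the critic, so no basis functions need to be hand-designed for each new system.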
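Contribution 3's self-tuning idea can be sketched as a PID loop whose gains are supplied by a learned model at each time step. The class below and its `gain_model` hook are hypothetical placeholders standing in for the DHP actor described in the thesis:

```python
class SelfTuningPID:
    """Minimal sketch of the DHP-PID idea: a learned mapping supplies the
    PID gains per step, so the controller adapts to the current reference
    path and robot state. Names are illustrative, not the thesis code."""

    def __init__(self, gain_model, dt):
        self.gain_model = gain_model  # state -> (kp, ki, kd), e.g. a DHP actor
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0         # zero initial derivative on first step

    def step(self, state, error):
        kp, ki, kd = self.gain_model(state)  # gains scheduled by the learner
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative
```

A fixed-gain PID controller is the special case where `gain_model` ignores the state, which is the baseline the thesis reports DHP-PID beating on all three reference paths.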
Keywords: Reinforcement Learning, Value Function Approximation, Temporal Difference Learning, Manifolds, PID Control, Mobile Robot Path Tracking Control