On Dynamic Scheduling Method Based On Averaged Reinforcement Learning Algorithm

Posted on: 2007-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: C F Song
Full Text: PDF
GTID: 2178360212971381
Subject: Control theory and control engineering
Abstract/Summary:
This thesis focuses on dynamic scheduling methods based on averaged rewards reinforcement learning algorithms.

Dynamic scheduling is scheduling under incomplete information: uncertainties in the environment force the scheduling policy to change frequently, so the problem can be treated as a process of policy optimization. It typically arises in stochastic real-world applications and usually has to balance multiple objectives under many constraints. The thesis first summarizes the existing approaches to dynamic scheduling and classifies them into two types: traditional methods based on operations research and intelligent methods based on AI. Reinforcement learning, a kind of machine learning, draws on dynamic programming, stochastic approximation and function approximation. The agent in this method learns a mapping from environment states to actions that maximizes the accumulated reward. Compared with other scheduling methods, reinforcement learning, treated in this thesis as an alternative way of solving scheduling problems, is well suited to dynamic scheduling because of its strong mathematical foundation and its limited need for an accurate model of the environment.

Many reinforcement learning algorithms exist today, and each has parameters that affect its performance. These effects need to be studied before reinforcement learning is applied in practice. To this end, a typical test environment, Grid-World, is introduced, and a visual software package is developed with object-oriented techniques in Visual C++ 6.0. The algorithm module of this software is implemented as a Dynamic Link Library (DLL). The averaged rewards reinforcement learning algorithm R-learning and the discounted algorithms Q-learning and Sarsa-learning are each programmed as such DLLs. Many experiments are run to test how the parameters affect the performance of these algorithms, and the differences between the algorithms are presented. Further study of reinforcement learning in the thesis builds on the conclusions drawn from these simulations.

In the last part of this thesis the averaged rewards reinforcement learning...
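
For readers unfamiliar with the three algorithms compared in the abstract, the following is a minimal Python sketch of their standard textbook one-step update rules. It is not the thesis's Visual C++ implementation: the function names, the dictionary-based value tables and the default step sizes are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical helper names and default step sizes; value tables map
# (state, action) pairs to floats and default to 0.0.

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Discounted, off-policy update: bootstrap from the best next action."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Discounted, on-policy update: bootstrap from the action actually taken next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

def r_learning_update(R, rho, s, a, r, s_next, actions, beta=0.1, alpha=0.01):
    """Average-reward update: no discount factor; rewards are measured
    relative to rho, a running estimate of the average reward per step.
    Returns the (possibly updated) rho."""
    best_here = max(R[(s, b)] for b in actions)
    best_next = max(R[(s_next, b)] for b in actions)
    was_greedy = R[(s, a)] == best_here  # checked before the table changes
    R[(s, a)] += beta * (r - rho + best_next - R[(s, a)])
    if was_greedy:
        # rho is adjusted only after greedy actions, so exploratory
        # moves do not distort the average-reward estimate.
        rho += alpha * (r - rho + best_next - best_here)
    return rho

if __name__ == "__main__":
    actions = ["left", "right"]
    Q = defaultdict(float)
    q_learning_update(Q, 0, "right", 1.0, 1, actions)
    R = defaultdict(float)
    rho = r_learning_update(R, 0.0, 0, "right", 1.0, 1, actions)
    print(Q[(0, "right")], R[(0, "right")], rho)
```

The main structural difference is visible in the signatures: the two discounted algorithms take a discount factor `gamma`, while R-learning instead subtracts the average-reward estimate `rho` from every reward and carries two step sizes, one for the value table and one for `rho`.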
Keywords/Search Tags: Dynamic Scheduling, Averaged Rewards Reinforcement Learning, R-learning, Function Approximation, Elevator Group Scheduling