
A Study Of Reinforcement Learning Based On Factor Representation

Posted on: 2010-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: S Dai
Full Text: PDF
GTID: 2178360275984414
Subject: Computer application technology
Abstract/Summary:
Reinforcement Learning (RL) is an effective method for solving planning problems in stochastic environments. In large state spaces, however, and especially in application problems with complex stochastic states, the "curse of dimensionality" remains unsolved. Factored Reinforcement Learning, which extends RL in both the state space and the action space, has been shown to solve large-scale stochastic control problems more effectively and has been applied to AGV navigation. At present, almost all research focuses on preprocessing the state space before Reinforcement Learning is applied, while relatively little work addresses the Reinforcement Learning process itself. Based on this observation, the following aspects are investigated and discussed.

Firstly, the theoretical background and development of factored reinforcement learning are introduced. Four typical RL algorithms are discussed and compared, and empirical results are presented to show their differences and characteristics, which offers a basis for choosing the right algorithm in the subsequent work.

Secondly, a new Dynamic Programming (DP) method based on a factored representation is presented. When DP is used to solve complex RL problems, it is hard to compute the exact value of Vπ, so a linear approximation of Vπ is proposed to speed up the algorithm. Traditional RL stores the value function in a lookup table, which is highly redundant; a decision-tree representation is proposed instead and is examined in simulation experiments in this thesis.

Finally, a new TD(λ) algorithm based on a factored representation is presented. Its main principle is to represent states in factored form, to use Dynamic Bayesian Networks (DBNs) to represent the conditional probability distributions of Markov Decision Processes (MDPs), and to combine these with a decision-tree representation of the value function within the TD(λ) algorithm, thereby reducing state-space exploration and computational complexity. The algorithm is therefore promising for large-scale MDP problems with huge state spaces. The validity of this representation is demonstrated by experiments.
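To make the linear approximation of Vπ concrete, the following is a minimal sketch of linear value-function approximation for policy evaluation: Vπ(s) is approximated by a weighted sum of basis functions, each depending on only a few state variables. The tiny two-variable MDP, the basis functions, the transition matrix, and the use of a direct least-squares fixed-point solve are all illustrative assumptions, not the thesis's actual method details or benchmark.

```python
# A minimal sketch: approximate V_pi(s) as a weighted sum of basis
# functions over a factored state. All numbers here are assumed for
# illustration only.
import numpy as np

# Factored state: s = (x0, x1), each binary -> 4 flat states.
states = [(x0, x1) for x0 in (0, 1) for x1 in (0, 1)]

# Basis functions: a bias term plus one indicator per state variable.
def features(s):
    x0, x1 = s
    return np.array([1.0, float(x0), float(x1)])

# Assumed transition matrix P (under a fixed policy) and rewards R.
P = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.1, 0.7, 0.1, 0.1],
              [0.1, 0.1, 0.7, 0.1],
              [0.1, 0.1, 0.1, 0.7]])
R = np.array([0.0, 1.0, 1.0, 2.0])
gamma = 0.9

Phi = np.vstack([features(s) for s in states])   # 4 x 3 feature matrix

# Least-squares fixed point of the projected Bellman equation:
# Phi w = projection of (R + gamma P Phi w), solved directly for w.
A = Phi.T @ (Phi - gamma * P @ Phi)
b = Phi.T @ R
w = np.linalg.solve(A, b)

print("weights:", w)
print("approx V:", Phi @ w)
```

Because the weight vector has one entry per basis function rather than one per state, the solve scales with the number of features, which is the source of the claimed speed-up over exact DP on the flat state space.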
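The decision-tree value representation can likewise be sketched. States that agree on the variables the tree actually tests share a single stored value, removing the redundancy of one lookup-table entry per state. The hand-built tree below is an illustrative assumption, not a tree learned by the thesis's algorithm.

```python
# A minimal sketch of a decision-tree value function: internal nodes
# test one state variable; leaves store values shared by all states
# reaching them. The tree below is hand-built for illustration.

# Internal node: ("split", variable_index, subtree_if_0, subtree_if_1)
# Leaf node:     ("leaf", value)
tree = ("split", 0,
        ("leaf", 0.0),               # x0 == 0: value independent of x1
        ("split", 1,
         ("leaf", 1.0),              # x0 == 1, x1 == 0
         ("leaf", 2.5)))             # x0 == 1, x1 == 1

def tree_value(tree, state):
    """Walk the tree, testing one state variable per internal node."""
    if tree[0] == "leaf":
        return tree[1]
    _, var, if0, if1 = tree
    return tree_value(if1 if state[var] else if0, state)

for s in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(s, tree_value(s=s, tree=tree))
```

Here three leaves cover four states; as more variables irrelevant to the value are added, a lookup table grows exponentially while the tree stays the same size, which is the redundancy argument made above.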
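Finally, a sketch of TD(λ) over a factored model: each state variable's next value is sampled from its own conditional distribution, the role the thesis assigns to DBNs. For brevity this sketch reuses the linear feature representation from the first example rather than the thesis's decision-tree representation; the two-variable dynamics, reward, and learning constants are illustrative assumptions.

```python
# A minimal sketch of TD(lambda) with eligibility traces over a
# DBN-style factored transition model. Dynamics, reward, and
# hyperparameters are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)
gamma, lam, alpha = 0.9, 0.8, 0.05

# Factored dynamics: each variable has its own conditional distribution
# given its parents, here P(x_i' = 1 | x0).
def step(s):
    x0, x1 = s
    x0_next = rng.random() < (0.8 if x0 else 0.2)   # x0' depends on x0
    x1_next = rng.random() < (0.7 if x0 else 0.3)   # x1' depends on x0
    return (int(x0_next), int(x1_next))

def reward(s):
    return float(s[0] + s[1])

def features(s):
    return np.array([1.0, float(s[0]), float(s[1])])

w = np.zeros(3)                 # linear value weights
for episode in range(200):
    s = (0, 0)
    z = np.zeros(3)             # eligibility trace
    for t in range(50):
        s_next = step(s)
        delta = reward(s) + gamma * features(s_next) @ w - features(s) @ w
        z = gamma * lam * z + features(s)   # accumulating trace
        w += alpha * delta * z
        s = s_next

print("learned weights:", w)
```

The factored `step` function never enumerates the joint state space, which is how the DBN representation keeps exploration and computation tractable as the number of state variables grows.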
Keywords/Search Tags: Factored RL, Model of Environment, DBN, Decision Tree, TD(λ) Algorithm