
Research On Reinforcement Learning Methods Based On Fuzzy Approximation

Posted on: 2015-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Mu
Full Text: PDF
GTID: 2268330428498399
Subject: Computer application technology
Abstract/Summary:
Reinforcement learning is a class of machine learning methods for solving Markov decision process problems: an agent interacts with its environment to maximize cumulative reward. A central challenge for current reinforcement learning is handling problems with large state or action spaces. To address the shortcomings of existing reinforcement learning methods based on fuzzy inference, this thesis uses fuzzy inference as an approximation mechanism and proposes several improved value function approximation methods based on fuzzy inference and basis function optimization.

i. To address the drawbacks that classic Q-iteration algorithms based on lookup tables or function approximation converge slowly and have difficulty producing a continuous policy, this thesis proposes an algorithm named DFR-Sarsa(λ) based on double-layer fuzzy reasoning and proves its convergence in theory. In this algorithm, the first reasoning layer uses fuzzy sets over states to compute continuous actions; the second reasoning layer uses fuzzy sets over actions to compute the components of the Q-value. The two fuzzy layers are then combined to compute the Q-value function over continuous action spaces.

ii. To address the slow convergence and poor robustness of fuzzy reinforcement learning methods, this thesis proposes an algorithm named IT2FI-Sarsa(λ) based on interval type-2 fuzzy inference and proves its convergence in theory. In the fuzzy inference system, the antecedent part partitions the continuous state space with a novel elliptical type-2 membership function, which gives the defuzzification a closed-form solution. After obtaining the Q-value function by type-2 fuzzy inference, the algorithm updates the parameters of the consequent part by gradient descent. Experimental results show that IT2FI-Sarsa(λ) not only converges well but is also robust to noise.
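The fuzzy value-function approximation underlying both algorithms can be sketched as a membership-weighted sum of rule consequents over fuzzy sets of states and actions. The following minimal illustration assumes triangular membership functions and scalar states and actions; function names and shapes are assumptions for illustration, not the thesis's actual DFR-Sarsa(λ) implementation:

```python
import numpy as np

def triangular(x, center, width):
    """Triangular membership degree of x for a fuzzy set at `center`."""
    return max(0.0, 1.0 - abs(x - center) / width)

def fuzzy_q(state, action, state_centers, action_centers, theta, width=1.0):
    """Approximate Q(s, a) as a membership-weighted sum of rule parameters.

    `theta[i, j]` is the consequent weight of the rule pairing state set i
    with action set j; the two membership layers mirror the idea of fuzzy
    sets over both the state and the action space.
    """
    phi_s = np.array([triangular(state, c, width) for c in state_centers])
    phi_a = np.array([triangular(action, c, width) for c in action_centers])
    w = np.outer(phi_s, phi_a)                  # firing strength of each rule
    if w.sum() == 0.0:
        return 0.0                              # no rule fires
    return float((w * theta).sum() / w.sum())   # normalized weighted sum
```

In a Sarsa(λ)-style method, the consequent weights `theta` would be the learned parameters, updated from temporal-difference errors.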
iii. Current basis functions are designed mainly from inaccurate prior knowledge, which can cause poor generalization when linear function approximation is applied to reinforcement learning. To overcome this shortcoming, this thesis proposes an adaptive basis function Q-iteration algorithm named ABF-QI and proves its convergence in theory. The algorithm selects basis functions in a top-down fashion: first, it computes the value function from the initial basis functions; second, it chooses the basis functions that need refinement according to a performance-evaluation criterion; last, it adjusts the number and shape of the basis functions with a hierarchical method.
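The top-down refinement step can be illustrated with a minimal sketch in which the worst-fitting basis function is replaced by two narrower ones. Gaussian bases, the residual-based criterion, and the halving rule here are assumptions for illustration, not the ABF-QI specification:

```python
import numpy as np

def gaussian_features(x, centers, widths):
    """Gaussian radial basis features of a scalar state x."""
    return np.exp(-((x - centers) ** 2) / (2.0 * widths ** 2))

def refine_worst_basis(centers, widths, residuals):
    """Split the basis function with the largest local residual.

    A stand-in for the top-down idea: the basis with the worst local fit
    is replaced by two narrower ones (a hierarchical split), changing both
    the number and the shape of the basis set.
    """
    i = int(np.argmax(residuals))
    c, w = centers[i], widths[i]
    new_centers = np.append(np.delete(centers, i), [c - w / 2.0, c + w / 2.0])
    new_widths = np.append(np.delete(widths, i), [w / 2.0, w / 2.0])
    order = np.argsort(new_centers)      # keep each center paired with its width
    return new_centers[order], new_widths[order]
```

Iterating this split between value-iteration sweeps grows resolution only where the approximation error concentrates, which is the intent of adaptive basis refinement.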
Keywords/Search Tags: reinforcement learning, value function approximation, fuzzy inference, type-2 fuzzy logic, basis function refinement