
Research On Reinforcement Learning Methods Based On Fuzzy Approximation

Posted on: 2015-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Mu
Full Text: PDF
GTID: 2268330428498399
Subject: Computer application technology
Abstract/Summary:
Reinforcement learning is a class of machine learning methods for solving Markov decision process problems: an agent interacts with its environment to maximize cumulative reward. A central challenge for current reinforcement learning is handling problems with large state or action spaces. To address the shortcomings of existing reinforcement learning methods based on fuzzy inference, this thesis uses fuzzy inference as an approximation mechanism and proposes several improved value function approximation methods based on fuzzy inference and basis function optimization.

i. To address the drawbacks that classic Q-iteration algorithms based on lookup tables or function approximation converge slowly and have difficulty producing a continuous policy, this thesis proposes an algorithm named DFR-Sarsa(λ) based on double-layer fuzzy reasoning and proves its convergence in theory. In this algorithm, the first reasoning layer uses fuzzy sets over states to compute continuous actions; the second reasoning layer uses fuzzy sets over actions to compute the components of the Q-value. The two fuzzy layers are then combined to compute the Q-value function over continuous action spaces.

ii. To address the slow convergence and poor robustness of fuzzy reinforcement learning methods, this thesis proposes an algorithm named IT2FI-Sarsa(λ) based on interval type-2 fuzzy inference and proves its convergence in theory. In the fuzzy inference system, the antecedent part partitions the continuous state space with a novel elliptical type-2 membership function, which gives the defuzzification a closed-form solution. After obtaining the Q-value function by type-2 fuzzy inference, the algorithm updates the parameters of the consequent part by gradient descent. Experimental results show that IT2FI-Sarsa(λ) not only converges well but is also robust to noise.
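The fuzzy value-function approximation underlying both algorithms can be sketched as a membership-weighted sum of rule consequents over fuzzy sets of states and actions. The following minimal illustration assumes triangular membership functions and scalar states and actions; function names and shapes are assumptions for illustration, not the thesis's actual DFR-Sarsa(λ) implementation:

```python
import numpy as np

def triangular(x, center, width):
    """Triangular membership degree of x for a fuzzy set at `center`."""
    return max(0.0, 1.0 - abs(x - center) / width)

def fuzzy_q(state, action, state_centers, action_centers, theta, width=1.0):
    """Approximate Q(s, a) as a membership-weighted sum of rule parameters.

    `theta[i, j]` is the consequent weight of the rule pairing state set i
    with action set j; the two membership layers mirror the idea of fuzzy
    sets over both the state and the action space.
    """
    phi_s = np.array([triangular(state, c, width) for c in state_centers])
    phi_a = np.array([triangular(action, c, width) for c in action_centers])
    w = np.outer(phi_s, phi_a)                  # firing strength of each rule
    if w.sum() == 0.0:
        return 0.0                              # no rule fires
    return float((w * theta).sum() / w.sum())   # normalized weighted sum
```

In a Sarsa(λ)-style method, the consequent weights `theta` would be the learned parameters, updated from temporal-difference errors.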
iii. Current basis functions are designed mainly from inaccurate prior knowledge, which can cause poor generalization when linear function approximation is applied to reinforcement learning. To overcome this shortcoming, this thesis proposes an adaptive basis function Q-iteration algorithm named ABF-QI and proves its convergence in theory. The algorithm selects basis functions in a top-down fashion: first, it computes the value function from the initial basis functions; second, it chooses the basis functions that need refinement according to a performance-evaluation criterion; last, it adjusts the number and shape of the basis functions with a hierarchical method.
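The top-down refinement step can be illustrated with a minimal sketch in which the worst-fitting basis function is replaced by two narrower ones. Gaussian bases, the residual-based criterion, and the halving rule here are assumptions for illustration, not the ABF-QI specification:

```python
import numpy as np

def gaussian_features(x, centers, widths):
    """Gaussian radial basis features of a scalar state x."""
    return np.exp(-((x - centers) ** 2) / (2.0 * widths ** 2))

def refine_worst_basis(centers, widths, residuals):
    """Split the basis function with the largest local residual.

    A stand-in for the top-down idea: the basis with the worst local fit
    is replaced by two narrower ones (a hierarchical split), changing both
    the number and the shape of the basis set.
    """
    i = int(np.argmax(residuals))
    c, w = centers[i], widths[i]
    new_centers = np.append(np.delete(centers, i), [c - w / 2.0, c + w / 2.0])
    new_widths = np.append(np.delete(widths, i), [w / 2.0, w / 2.0])
    order = np.argsort(new_centers)      # keep each center paired with its width
    return new_centers[order], new_widths[order]
```

Iterating this split between value-iteration sweeps grows resolution only where the approximation error concentrates, which is the intent of adaptive basis refinement.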
Keywords/Search Tags: reinforcement learning, value function approximation, fuzzy inference, type-2 fuzzy logic, basis function refinement