
Research Of Reinforcement Learning Based On Clustering Analysis

Posted on: 2021-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: B Li
Full Text: PDF
GTID: 2428330620964286
Subject: Engineering

Abstract/Summary:
Reinforcement learning (RL), a computational approach that learns from interaction, is an important branch of machine learning. It is characterized by a simple structure and strong generalization ability, and has shown great potential in areas such as intelligent decision-making, autonomous driving, and robot control. Hierarchical reinforcement learning (HRL) addresses the challenges of learning, planning, and representing knowledge at multiple levels of temporal abstraction by introducing the notion of options. The hierarchical structure of HRL can either be specified in advance by a designer using prior experience or be discovered automatically. How to discover this hierarchical structure automatically, and how to generate the subtask policies, are two open problems in HRL.

Clustering is a representative unsupervised learning method that can effectively uncover the internal structure of a data set. It has been widely used in pattern recognition, image segmentation, and computer vision, and often serves as a preprocessing step for other machine learning tasks. Introducing clustering methods into reinforcement learning therefore has significant research value. The main research work is as follows.

First, to decompose a task into subtasks, this thesis proposes a clustering-based algorithm that discovers sub-goals automatically. The algorithm uses the successor representation to build a predictive model of the state space; on that basis, state clustering is applied to find key states in well-connected regions (states that navigate the agent through an area), and these are defined as sub-goals. Compared with traditional state clustering methods, the proposed algorithm obtains more reasonable sub-goals and performs better in asymmetric environments. Unlike graph-based approaches, which focus on finding bottleneck states, the proposed method offers a more flexible implementation.

Second, on the issue of policy generation for subtasks, this thesis designs a new exploration bonus that helps learn the intra-option policies through a latent learning process. Specifically, the intra-option policies are learned with a pseudo reward function defined by the successor representation vector of the sub-goal states. The thesis also proposes an incremental algorithm that iterates between estimating the successor representation and building options, yielding more robust intra-option policies.

Third, this thesis shows how to apply the options generated by the clustering-based algorithm to reinforcement learning, using tabular methods as an example to demonstrate the effect of the option generation algorithm. The results show that the options generated by clustering are reasonably distributed over the state space, guide the agent's exploration, and greatly improve the convergence rate of reinforcement learning.
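To make the first contribution concrete, the successor representation (SR) used as the predictive model of the state space can be estimated by simple temporal-difference learning in the tabular case. The sketch below is illustrative only (the function name, learning rate, and transition format are assumptions, not the thesis's implementation): the SR matrix M[s, s'] approximates the expected discounted number of future visits to s' starting from s under the behavior policy.

```python
import numpy as np

def learn_successor_representation(transitions, n_states, gamma=0.95, lr=0.1):
    """Estimate a tabular successor representation (SR) by TD learning.

    M[s, s'] approximates the expected discounted count of future visits
    to s' when starting from s under the policy that generated the
    transitions. `transitions` is a sequence of (s, s_next) pairs.
    """
    M = np.zeros((n_states, n_states))
    I = np.eye(n_states)
    for s, s_next in transitions:
        # TD update: M(s) <- M(s) + lr * (1_s + gamma * M(s_next) - M(s))
        M[s] += lr * (I[s] + gamma * M[s_next] - M[s])
    return M
```

With enough experience, M converges toward (I - gamma * P)^(-1) for the policy's transition matrix P, so rows of M summarize which states are reachable from where, which is exactly the structure the clustering step exploits.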
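The clustering-based sub-goal discovery and the SR-defined pseudo reward can also be sketched in a few lines. The abstract does not specify the clustering algorithm or the sub-goal selection rule, so the following uses plain k-means over SR rows and picks, per cluster, the state whose SR row is closest to the centroid; both choices are assumptions for illustration. Likewise, `pseudo_reward` reads off one column of the SR matrix as a stand-in for the thesis's pseudo reward defined by the sub-goal's successor representation vector.

```python
import numpy as np

def find_subgoals(M, n_clusters=2, n_iters=50, seed=0):
    """Cluster SR rows with a minimal k-means and pick one sub-goal per cluster.

    States with similar SR rows share predicted future occupancy, so they
    belong to the same well-connected region; the state nearest each
    centroid is taken as that region's sub-goal (an illustrative rule).
    """
    rng = np.random.default_rng(seed)
    centroids = M[rng.choice(len(M), n_clusters, replace=False)]
    for _ in range(n_iters):
        # Assign each state to the nearest centroid in SR space.
        dists = np.linalg.norm(M[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Recompute centroids from the assigned SR rows.
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = M[labels == k].mean(axis=0)
    subgoals = [
        int(np.argmin(np.linalg.norm(M - centroids[k], axis=1)))
        for k in range(n_clusters)
    ]
    return labels, subgoals

def pseudo_reward(M, subgoal):
    """Illustrative pseudo reward for intra-option learning: each state's
    SR-predicted discounted reachability of the sub-goal state."""
    return M[:, subgoal]
```

An intra-option policy for a given sub-goal could then be learned by any tabular method (e.g. Q-learning) using `pseudo_reward(M, g)` in place of the environment reward, which matches the latent-learning flavor described above.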
Keywords/Search Tags:reinforcement learning, hierarchical reinforcement learning, successor representation, automatic hierarchy, state clustering