
Research On Reinforcement Learning Based On Clustering Algorithm

Posted on: 2021-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: L P Yu
Full Text: PDF
GTID: 2428330623968211
Subject: Computer Science and Technology
Abstract/Summary:
Reinforcement learning is a key research direction in machine learning that addresses sequential decision problems. It is mainly applied to interactive decision-making problems that common supervised and unsupervised learning methods handle poorly. Generally speaking, reinforcement learning simulates human learning: the agent receives rewards according to the effect of its actions and ultimately achieves its goal through continual interaction with the environment. However, many problems in reinforcement learning remain open, such as the “curse of dimensionality”. Hierarchical reinforcement learning can alleviate this problem to some extent by introducing abstraction mechanisms, but classical hierarchical methods such as Option, HAM and MAXQ all require the hierarchy to be constructed manually before the task is learned, and they struggle to achieve the desired results when prior knowledge is insufficient.

To solve this problem, an option discovery algorithm based on KMeans++ and the successor representation is proposed. First, the KMeans++ algorithm is used to cluster the successor representation of the state space and obtain a subgoal set. Second, options are obtained through latent learning without external rewards; locally, this moves the states of each cluster gradually toward the cluster center, and globally, it moves each state gradually toward the state with the maximum successor representation value. Finally, SMDP-Q-learning is used to solve the environment tasks, and Grid-1 and Grid-2 are used to verify the performance of the algorithm.

h-DQN is a hierarchical deep reinforcement learning algorithm based on spatiotemporal abstraction and intrinsic motivation, but the h-DQN model requires intermediate goals to be set manually, which limits its application scenarios. Unified model-free hierarchical deep reinforcement learning uses an unsupervised method to obtain the subgoal set automatically, but in more complex environments the candidate subgoals are of poor quality, which makes the agent inefficient and degrades its performance. To solve this problem, a hierarchical deep reinforcement learning method based on a clustering algorithm is proposed. First, the latest transition experience set is used, together with a step-by-step anomaly detection algorithm and a clustering algorithm, to obtain the candidate subgoal set G automatically. Then an improved kWTA is used to build a controller neural network for each subgoal in G, and these networks are pre-trained. Finally, the meta-controller and controller framework is used to train the whole task. Experimental results show that the algorithm also achieves excellent results in complex environments.
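
The first step of the option discovery algorithm, clustering the successor representation with KMeans++ to obtain subgoals, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the thesis's implementation: the two-room grid, the uniform random-walk policy, the discount factor, the function names, and the choice of taking the member state nearest each cluster center as the subgoal.

```python
import random


def random_walk_matrix(width, height, walls=frozenset()):
    """Transition matrix of a uniform random walk on a grid (hypothetical stand-in environment)."""
    cells = [(r, c) for r in range(height) for c in range(width) if (r, c) not in walls]
    index = {cell: i for i, cell in enumerate(cells)}
    n = len(cells)
    P = [[0.0] * n for _ in range(n)]
    for (r, c), i in index.items():
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            # Moving into a wall or off the grid leaves the agent where it is.
            j = index.get((r + dr, c + dc), i)
            P[i][j] += 0.25
    return cells, P


def successor_representation(P, gamma=0.9, sweeps=200):
    """Iterate M <- I + gamma * P M, which converges to (I - gamma P)^-1."""
    n = len(P)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        M = [[(1.0 if i == j else 0.0) + gamma * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    return M


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp(points, k, rng, iters=50):
    """k-means with k-means++ seeding, followed by plain Lloyd iterations."""
    # Seeding: each new center is drawn with probability proportional to the
    # squared distance from the nearest center chosen so far.
    centres = [list(rng.choice(points))]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(list(rng.choices(points, weights=d2, k=1)[0]))

    def assign():
        return [min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points]

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return centres, assign()


def subgoals_from_sr(M, k, seed=0):
    """Cluster SR rows; take the member state nearest each center as a subgoal (an assumption)."""
    centres, labels = kmeans_pp(M, k, random.Random(seed))
    subgoals = []
    for j, centre in enumerate(centres):
        members = [s for s, lab in enumerate(labels) if lab == j] or list(range(len(M)))
        subgoals.append(min(members, key=lambda s: sq_dist(M[s], centre)))
    return subgoals, labels


# Two rooms of a 3x7 grid joined by a single doorway at (1, 3).
cells, P = random_walk_matrix(width=7, height=3, walls=frozenset({(0, 3), (2, 3)}))
M = successor_representation(P)
subgoals, labels = subgoals_from_sr(M, k=2)
print([cells[s] for s in subgoals])
```

Each row of `M` describes the discounted expected future occupancy of every state from one starting state, so states in the same room tend to have similar rows and fall into the same cluster. The options that lead to these subgoals, and the SMDP-Q-learning stage that uses them, are not covered by this sketch.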
Keywords/Search Tags: clustering, step-by-step anomaly detection, successor representation, hierarchical reinforcement learning, hierarchical deep reinforcement learning