
Research On Continual Reinforcement Learning Based On Hindsight And Progressive Expansion

Posted on: 2022-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: M X Du
Full Text: PDF
GTID: 2518306569997549
Subject: Computer Science and Technology
Abstract/Summary:
With the introduction of Deep Learning, Reinforcement Learning has achieved breakthrough progress. In recent years, Deep Reinforcement Learning, which combines the two, has become one of the mainstream directions of Artificial Intelligence. Deep Reinforcement Learning is a paradigm for Artificial General Intelligence through which most problems can be formalized, but it is still at an exploratory stage of research and cannot yet solve problems stably and effectively. On the one hand, DeepMind's AlphaGo agent defeated human players at Go, a complete-information game; on the other hand, because of partial observability and uncertainty, incomplete-information games have become a research focus in the field of machine game playing. This dissertation studies learning methods for Continual Reinforcement Learning agents in incomplete-information game environments, and proposes solutions to two problems in this setting: sparse rewards and multiple subtasks.

To address sparse rewards, this dissertation proposes a Reinforcement Learning method based on Direct Future Prediction trained with supervised signals. The abundant agent-state changes in an incomplete-information game scene are used as supervision signals in place of the reward signal of traditional Reinforcement Learning; each prediction network is trained by supervised regression, and action selection is combined with a goal-oriented Reinforcement Learning method. At the same time, a hindsight method is used to shape an off-policy hindsight experience pool, which mitigates the unevenness of supervision signals in incomplete-information games and improves the efficiency of the future-value prediction algorithm.

To address the problem of multiple subtasks in an incomplete-information game environment, this dissertation mainly uses a curriculum built on the inheritance relations between subtasks to learn each subtask step by step. To counter the catastrophic forgetting caused by knowledge transfer during curriculum learning, the Progressive Neural Network from the Continual Learning framework is introduced to dynamically expand the Direct Future Prediction network structure: old knowledge is reused while a new network column learns new knowledge, ensuring that previously learned knowledge is not forgotten. Because the discrete prediction networks of the Direct Future Prediction architecture are independent of one another, new prediction networks can be freely added or discarded when tasks differ in dimensionality, solving the problem of inconsistent action dimensions across tasks in a complex environment.

This dissertation uses the Pommerman game, an incomplete-information game environment, as the experimental test platform. First, the effectiveness of the baseline method is verified by comparison with several classic Reinforcement Learning methods. The effectiveness of the methods proposed in Chapter 3 and Chapter 4 is then verified through comparative ablation experiments. Finally, game tests against NIPS competition agents verify the performance improvement of the agent proposed in this dissertation.
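The hindsight shaping of an off-policy experience pool described above can be sketched roughly as follows. This is a minimal illustration in the style of hindsight experience replay, assuming a "future" goal-sampling strategy; the transition layout, `reward_fn`, and parameter `k` are illustrative assumptions, not the dissertation's exact design.

```python
import random

def relabel_with_hindsight(episode, reward_fn, k=4):
    """Relabel an episode's transitions with achieved goals (hypothetical sketch).

    episode: list of (state, action, achieved_goal, original_goal) tuples.
    reward_fn(achieved, goal) -> float, e.g. 0.0 if the goal is reached, else -1.0.
    Returns (state, action, goal, reward) tuples containing both the original
    sparse-reward transitions and the hindsight-relabeled ones.
    """
    buffer = []
    T = len(episode)
    for t, (s, a, ag, g) in enumerate(episode):
        # keep the original transition with its (usually sparse) reward
        buffer.append((s, a, g, reward_fn(ag, g)))
        # sample up to k goals actually achieved at this or a later step;
        # judged against these goals, the trajectory yields dense feedback
        future = list(range(t, T))
        for idx in random.sample(future, min(k, len(future))):
            new_g = episode[idx][2]
            buffer.append((s, a, new_g, reward_fn(ag, new_g)))
    return buffer
```

Relabeling in this way evens out the supervision signal: even an episode that never reaches its original goal still produces transitions whose goals were, in hindsight, achieved.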
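The progressive expansion of per-task prediction networks can be sketched as follows. This is a simplified sketch under stated assumptions: linear heads stand in for the prediction networks, lateral connections between columns are omitted, and the class name and methods (`add_task`, `freeze_task`) are hypothetical, not the dissertation's API.

```python
import numpy as np

class ProgressivePredictor:
    """Progressively expanded prediction heads over a shared feature vector.

    Heads for completed tasks are frozen, so later training cannot overwrite
    them (avoiding catastrophic forgetting), and each new head may use its
    own action dimension, so tasks with mismatched action spaces coexist.
    """

    def __init__(self, feature_dim):
        self.feature_dim = feature_dim
        self.heads = {}        # task_id -> weight matrix (action_dim x feature_dim)
        self.frozen = set()

    def add_task(self, task_id, action_dim, rng=None):
        # expand the model with a fresh, trainable head for the new task
        rng = rng or np.random.default_rng(0)
        self.heads[task_id] = rng.normal(
            scale=0.1, size=(action_dim, self.feature_dim))

    def freeze_task(self, task_id):
        # lock in the knowledge learned for a finished subtask
        self.frozen.add(task_id)

    def update(self, task_id, grad, lr=0.01):
        if task_id in self.frozen:
            raise ValueError(f"task {task_id} is frozen")
        self.heads[task_id] -= lr * grad

    def predict(self, task_id, features):
        return self.heads[task_id] @ features
```

Because each head is independent, adding a subtask with six actions after one with four requires no change to the earlier head, which mirrors how the independent prediction networks are discarded or expanded per task.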
Keywords/Search Tags: Incomplete Information Game, Reinforcement Learning, Continual Learning, Curriculum Learning