
Reinforcement Learning Control Methods Based on a Prior Knowledge Model: Studies and Implementation

Posted on: 2020-02-07  Degree: Master  Type: Thesis
Country: China  Candidate: W X Wei  Full Text: PDF
GTID: 2518306548493654  Subject: Computer Science and Technology
Abstract/Summary:
Because standard reinforcement learning starts training without any prior knowledge, training time grows rapidly as task complexity increases, which limits the scenarios in which reinforcement learning can be applied. It is therefore of great theoretical significance to supply the training results of similar tasks to a new reinforcement learning task as a prior knowledge model, so as to reduce reinforcement learning's dependence on large numbers of training samples and to improve the usability and generality of reinforcement learning algorithms. This thesis carries out the following research and contributions: (1) a policy transfer method based on a prior knowledge model; (2) a reinforcement learning algorithm framework based on the prior knowledge model; (3) a task similarity measure based on dynamics models.

The prior knowledge model is designed around the training characteristics of model-based and model-free reinforcement learning algorithms: it transfers the model-based control policy into a neural network through imitation learning, and this network then serves as the initial policy that the model-free algorithm continues to optimize. For the case where the source task and the target task are the same, this thesis implements an algorithm framework based on the prior knowledge model that combines the advantages of model-based and model-free reinforcement learning. The framework provides a robust way to use model-based and model-free algorithms together, and it performs well on a forward-locomotion posture learning task for a simulated agent, achieving up to 3 times the sample efficiency of random policy initialization.

For scenarios with a low-dimensional source task and a high-dimensional target task, this thesis further proposes a task similarity measure based on dynamics models. In the proposed framework, the dynamics models of different tasks are implemented as neural networks with the same hidden-layer structure. By measuring the difference between their hidden-layer weight matrices, the dynamics models of the source and target tasks can be compared and a task similarity score obtained. This similarity is used as the combination weight of the low-dimensional prior knowledge model, which keeps knowledge transfer between similar tasks reliable. In the forward-locomotion learning experiment, the proposed framework again achieves up to 2 times the sample efficiency of random policy initialization.
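As an illustration only (not the thesis's own code), the sketch below shows one way the two core ideas described above could be wired up in PyTorch: behaviour-cloning the (state, action) pairs produced by a model-based controller into a policy network that then initializes the model-free learner, and scoring task similarity from the hidden-layer weight matrices of two dynamics networks that share the same structure. All class and function names (PolicyNet, clone_model_based_policy, task_similarity) are assumptions introduced for this sketch.

    import torch
    import torch.nn as nn

    class PolicyNet(nn.Module):
        """Small MLP; the single hidden layer mirrors the assumption that all
        dynamics/policy networks share the same hidden-layer structure."""
        def __init__(self, in_dim, out_dim, hidden=64):
            super().__init__()
            self.hidden = nn.Linear(in_dim, hidden)
            self.out = nn.Linear(hidden, out_dim)

        def forward(self, x):
            return self.out(torch.tanh(self.hidden(x)))

    def clone_model_based_policy(policy, demos, epochs=50, lr=1e-3):
        """Imitation-learning step: regress the network onto (state, action)
        pairs collected from the model-based controller on the source task.
        The returned network serves as the initial policy of the model-free stage."""
        states, actions = demos          # tensors of shape [N, obs_dim], [N, act_dim]
        opt = torch.optim.Adam(policy.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(policy(states), actions)
            loss.backward()
            opt.step()
        return policy

    def task_similarity(dyn_source, dyn_target):
        """Compare two dynamics networks with identical hidden-layer shapes by the
        normalised difference of their hidden-layer weight matrices; values near 1
        indicate similar tasks and give a larger combination weight to the
        low-dimensional prior knowledge model."""
        w_s = dyn_source.hidden.weight.detach()
        w_t = dyn_target.hidden.weight.detach()
        diff = torch.norm(w_s - w_t) / (torch.norm(w_s) + torch.norm(w_t) + 1e-8)
        return float(1.0 - diff)

In this sketch the similarity score would simply scale how strongly the cloned low-dimensional policy is trusted when initializing the high-dimensional target task; the thesis's actual weighting scheme may differ.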
Keywords/Search Tags:Reinforcement Learning, Imitation Learning, Prior Knowledge Model