Research On Performance Of Actor-Critic-based Fusion Algorithm In Classical Control Problems

Posted on:2020-05-22

Degree:Master

Type:Thesis

Country:China

Candidate:D W Wang

Full Text:PDF

GTID:2518306104995509

Subject:Software engineering

Abstract/Summary:


Deep reinforcement learning has achieved notable results in a growing number of fields, but in most cases a model trained for one task performs poorly on a new task. Meta-learning theory holds that a deep learning model can use prior knowledge to acquire the ability to learn quickly on new tasks; its combination with reinforcement learning is called meta reinforcement learning. Building on Actor-Critic, this thesis first explores the performance of a Double-Critic model, constructed from an action-value network and a state-value network, on the same task and on similar tasks, and analyzes the results. A Meta-Critic model is then constructed by combining the Double-Critic model with a task encoder, and a pre-trained meta-model is obtained by training it on different tasks with different policy networks. When a new task is given, the action-value network in the meta-model can be regarded as a prediction network: before the agent makes a decision, the expectation of the next state value is calculated from the action values provided by the prediction network, and the current policy is updated according to this expectation, so that the agent explores the new task as quickly as possible and converges to the optimal policy. The target value in the loss function of the action-value network is given by the state-value network. This makes the update of the Meta-Critic model independent of the policy network's action predictions and further improves the stability of the model adjustment process. Finally, the model and other algorithms are tested on several new tasks to compare their performance. The experimental results show that the model performs better on the new tasks, which indicates that the Meta-Critic model can effectively guide the policy network in a new task by learning from existing tasks. It is also expected that the model can incorporate ideas from offline learning algorithms to make full use of existing data, so that the pre-training process becomes faster and more stable.
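The abstract gives no equations, so the following is only a minimal sketch of the two ideas it describes, under stated assumptions: the action-value target is bootstrapped from the state-value estimate of the next state (assumed here to be a one-step form, reward plus discounted next state value), and the expectation used to guide the policy is taken over the action values weighted by the policy's action probabilities. The function names, the discount factor `gamma`, and the squared-error loss are hypothetical choices for illustration and are not taken from the thesis.

```python
# Illustrative sketch only; names and the one-step target form are assumptions.

def q_target(reward, next_state_value, gamma=0.99, done=False):
    """Target for Q(s, a), bootstrapped from V(s') rather than from
    Q(s', a') with a' chosen by the policy network."""
    if done:
        return reward  # no bootstrapping past a terminal state
    return reward + gamma * next_state_value


def q_loss(q_pred, target):
    """Squared-error loss between the predicted action value and its target."""
    return 0.5 * (q_pred - target) ** 2


def expected_value(action_probs, q_values):
    """Expectation of the action values under the policy's action distribution."""
    return sum(p * q for p, q in zip(action_probs, q_values))


if __name__ == "__main__":
    target = q_target(reward=1.0, next_state_value=2.0, gamma=0.9)
    print("Q target:", target)
    print("Q loss:", q_loss(q_pred=3.0, target=target))
    print("Expected value:", expected_value([0.25, 0.75], [4.0, 0.0]))
```

Because `q_target` never queries the policy for a next action, the critic update in this sketch does not depend on the policy network's action prediction, which is the property the abstract credits for more stable adjustment.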

Keywords/Search Tags:

Reinforcement learning, Meta-learning, Prediction network, Prior knowledge


Related items

1	Algorithm Research On Knowledge Reuse And Generalization Ability Of Meta-learning
2	Reinforcement Learning Control Methods Based On Prior Knowledge Model:Studies And Implementation
3	Research On Knowledge Base Question Answer Based On Meta Learning
4	Using Task Prior In Reinforcement Learning Exploration
5	Research On Machine Learning Algorithms Based On Planning Network Model
6	User Mobile Pattern Mining Based On Meta Learning And Variational Generation Network
7	Controller Synthesis For Intelligent Systems Based On Meta-Reinforcement Learning
8	Research On Approaches To Complex Question Answering Over Knowledge Bases
9	Research And Application Of Reinforcement Learning In Autonomous Driving Research On Automatic Driving Algorithm Based On Prior Knowledge In Multiple Driving Scenarios
10	Research On Laser Navigation AGV Control Method Based On Reinforcement Learning