A Fast Adaptation Method For Agents Based On Meta-Learning And Deep Reinforcement Learning

Posted on: 2022-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: N X Huang
GTID: 2518306317457904
Subject: Master of Engineering
Abstract/Summary:
As technology advances, people expect artificial intelligence not only to handle heavy scientific and engineering computation but also to learn in ways comparable to humans and to complete increasingly complex tasks. In recent years, as research on deep reinforcement learning has deepened, it has come to be regarded as an important path toward general artificial intelligence. However, a large gap remains between deep reinforcement learning and human learning. When encountering a new task, humans can make effective use of prior knowledge and need only a small number of samples to perform well, whereas deep reinforcement learning needs a large number of training samples, and its algorithms suffer from long convergence times, high sample complexity, and poor stability. Meta reinforcement learning emerged to address these shortcomings. At present, however, meta reinforcement learning is mainly used with on-policy methods and is difficult to extend to off-policy methods, and new tasks must be consistent with the training tasks. To address this, this thesis proposes a rapid adaptation method for agents based on deep reinforcement learning and meta-learning, so that an agent can make effective use of prior knowledge and adapt rapidly to new tasks in a way closer to human learning. The main work of this thesis is as follows.

(1) To let the agent use prior knowledge to learn network initialization parameters, this thesis proposes a rapid adaptation algorithm for agents based on model-agnostic meta-learning and deep reinforcement learning. The algorithm addresses two problems of existing meta reinforcement learning: it is difficult to extend to off-policy methods, and the environment state of the new task must be consistent with that of the training tasks. Specifically, the algorithm is divided into two parts, a meta-training process and a meta-testing process. In meta-training, the agent learns the network initialization parameters through two gradient descent updates; in meta-testing, it fine-tunes the learned initialization parameters and finally completes the new task. The experimental results show that deep reinforcement learning combined with meta-learning outperforms the traditional deep reinforcement learning algorithm in convergence speed and stability, and achieves rapid adaptation of the agent to the new task.

(2) To let the agent use prior knowledge to learn the update rule, this thesis proposes a rapid adaptation algorithm for agents based on a long short-term memory network and deep reinforcement learning. Specifically, the gradient descent update is optimized by a long short-term memory network. The simulation results show that the algorithm with the long short-term memory network further improves convergence speed and stability, and achieves rapid adaptation of the agent to the new task.

(3) To let the agent use prior knowledge to learn the network architecture, this thesis proposes a rapid adaptation algorithm for agents based on meta-learning, neural network evolution, and deep reinforcement learning. Specifically, the agent uses neural network evolution and deep reinforcement learning to learn the network architecture in the meta-training process, then fine-tunes that architecture on the new task, and finally completes the new task. The simulation results show that, compared with existing algorithms, the algorithm improves convergence speed and stability, and achieves rapid adaptation of the agent to the new task.
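The meta-training and meta-testing split described in item (1) can be illustrated with a toy sketch. This is not the thesis's algorithm: it assumes each "task" is a one-dimensional quadratic loss standing in for a reinforcement learning objective, so that both gradient updates of a model-agnostic meta-learning style scheme can be written by hand. All names (`inner_adapt`, `meta_train`, `task_centers`) and the step sizes are hypothetical.

```python
# Toy sketch of a MAML-style two-level update (illustrative only).
# Assumption: task c has loss 0.5 * (theta - c)**2 in place of an RL objective.

def loss(theta, c):
    return 0.5 * (theta - c) ** 2


def grad(theta, c):
    # Derivative of the quadratic task loss with respect to theta.
    return theta - c


def inner_adapt(theta, c, alpha, steps=1):
    """Inner update: plain gradient steps on a single task."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta, c)
    return theta


def meta_train(task_centers, theta0=0.0, alpha=0.1, beta=0.5, iterations=200):
    """Outer update: improve the shared initialization theta using the
    loss measured after one inner adaptation step on each task."""
    theta = theta0
    for _ in range(iterations):
        meta_grad = 0.0
        for c in task_centers:
            adapted = inner_adapt(theta, c, alpha)
            # Chain rule through one inner step: d(adapted)/d(theta) = 1 - alpha.
            meta_grad += grad(adapted, c) * (1.0 - alpha)
        theta = theta - beta * meta_grad / len(task_centers)
    return theta


# Meta-training on three tasks, then meta-testing (fine-tuning) on a new one.
train_tasks = [2.0, 3.0, 4.0]
meta_init = meta_train(train_tasks)

new_task = 3.5
adapted_from_meta = inner_adapt(meta_init, new_task, alpha=0.1, steps=3)
adapted_from_scratch = inner_adapt(0.0, new_task, alpha=0.1, steps=3)

print("meta-learned initialization:", meta_init)
print("loss after fine-tuning from meta init:", loss(adapted_from_meta, new_task))
print("loss after fine-tuning from zero:", loss(adapted_from_scratch, new_task))
```

In this toy setting the learned initialization settles near the center of the training tasks, so the same three fine-tuning steps leave a smaller loss on the new task than starting from zero. A real implementation would replace the hand-written gradients with automatic differentiation through a policy or value network.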
Keywords/Search Tags: deep reinforcement learning, meta-learning, rapid adaptation