Reinforcement Learning Agent Design Based On Deep Perception And Imitation Learning

Posted on: 2020-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: S Yu
Full Text: PDF
GTID: 2428330572967275
Subject: Engineering
Abstract/Summary:
Reinforcement learning, an important branch of machine learning, aims to solve complex sequential decision problems in artificial intelligence. It has achieved notable success in several fields, including robotics, resource management, recommendation, and games. Games, which offer advantages in sample efficiency and safety, have gradually become an essential research platform for reinforcement learning. Although great success has been achieved in board games (for example, AlphaGo), controlling large video games remains a challenge for reinforcement learning. Large video games have more complex visual input and environment transition dynamics, which makes them a good simulation of real-world decision problems. They also provide a research platform for multi-agent coordination and transfer learning, which are essential parts of artificial general intelligence. In this thesis, we present an intelligent game control system for FIFA 18, a 3D soccer simulation video game, based on deep environment perception and behavior-imitation-aided reinforcement learning. Our system achieves a 95.7% goal rate, surpassing a human expert and other commonly used reinforcement learning algorithms. Our contributions are as follows:
1. We propose a separated deep environment perception and intelligent decision model that decouples perception from decision making in our system. Transfer learning on the complex small-object perception problem largely ensures the performance of perception learning and lowers the difficulty of the subsequent decision learning.
2. We propose a behavior-imitation-aided reinforcement learning algorithm. By using imitation learning to solve the reward delay problem and reinforcement learning to learn the reward-sensitive actions, this algorithm greatly reduces the search space of reinforcement learning and improves its stability.
3. We present several improvements to current reinforcement learning algorithms, such as knowledge distillation for accurate Q-value function estimation, a counter-based exploration method, reward reshaping, and redistributed prioritized experience replay, which further improve the performance and stability of our algorithm.
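The abstract does not say how the counter-based exploration method is implemented. As a rough illustration only, the sketch below shows one common form of the idea: a bonus that shrinks as a state is visited more often, added to the environment reward. The class name `CountBonus`, the function `shaped_reward`, the scale `beta`, and the bonus formula `beta / sqrt(N(s))` are all assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import defaultdict


class CountBonus:
    """Visit-count exploration bonus, beta / sqrt(N(s)).

    Illustrative only: the formula and names are assumptions,
    not the method described in the thesis.
    """

    def __init__(self, beta=0.1):
        self.beta = beta                 # hypothetical bonus scale
        self.counts = defaultdict(int)   # visits per (hashable) state key

    def bonus(self, state_key):
        """Record one visit to state_key and return its bonus."""
        self.counts[state_key] += 1
        return self.beta / math.sqrt(self.counts[state_key])


def shaped_reward(env_reward, state_key, counter):
    """Environment reward plus the count-based exploration bonus."""
    return env_reward + counter.bonus(state_key)
```

Rarely visited states receive a larger bonus, which nudges the agent toward them; in a visual domain such as a video game, the state key would have to come from some discretization or hashing of the perceived state, which is left unspecified here.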
Keywords/Search Tags: Artificial Intelligence, Deep Learning, Reinforcement Learning, Imitation Learning