Population-based Hyper-parameter Adaptation For Deep Reinforcement Learning

Posted on: 2020-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Zhou
GTID: 2370330572487259
Subject: Information and Communication Engineering
Abstract/Summary:
Machine game playing imitates and improves on intelligent human behavior. It is an ideal testbed for AI technology, known as "the fruit fly of AI", and has broad application prospects. Deep reinforcement learning is an effective and mainstream approach to machine game problems, but it still has open problems. Among them, hyper-parameter setting has important research value because it directly affects learning efficiency. The main work and results of this thesis are as follows.

First, deep reinforcement learning methods for machine games are studied systematically. To address the problem of baselines in the field that no longer hold, this thesis reports up-to-date baseline results obtained with the field's standard experimental and evaluation methods. The mainstream deep reinforcement learning algorithms are also studied and analyzed experimentally, including their data efficiency and sampling efficiency; several new phenomena are observed and new conclusions are drawn.

Second, an efficient, population-based, online hyper-parameter adaptation training method for deep reinforcement learning is proposed. Unlike traditional supervised learning, deep reinforcement learning is a highly dynamic and non-stationary optimization process, and its performance is sensitive to the choice of hyper-parameter configuration, such as the learning rate, discount factor, and step size. In deep reinforcement learning, hyper-parameters are best adjusted adaptively as learning progresses, rather than held at one fixed configuration from start to finish. The proposed method is an improved version of PBT (population-based training). Inspired by genetic algorithms, recombination operations are introduced into the population to accelerate its convergence to better temporarily optimal hyper-parameter configurations. A series of experiments shows that the method further improves model performance.

Third, a two-stage population-based hyper-parameter adaptation training method is proposed. Building on the previous research, this thesis hypothesizes that in the early stages, when the learning model has little knowledge of the environment, frequent hyper-parameter changes do not help the model learn effectively, whereas learning with a reasonable set of fixed hyper-parameters helps the model acquire the necessary knowledge as quickly and consistently as possible. This thesis argues that this is especially important in the early stages of reinforcement learning. The hypothesis is first verified experimentally, and on that basis a two-stage population-based hyper-parameter adaptation training method is proposed. The experimental results show that the proposed method significantly improves the performance of the population-based hyper-parameter adaptation method.
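
The two ideas in the abstract, recombination inside a PBT-style population and a fixed-hyper-parameter first stage, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the recombination operator, so uniform crossover between two top members is assumed here, and the function names, the member dictionary layout ("hparams", "weights", "score"), the perturbation factors, and the `warmup_steps` and `interval` parameters are all hypothetical.

```python
import copy
import random


def recombine_and_explore(population, rng=None, frac=0.25, perturb=0.2, bounds=None):
    """Replace the worst members with perturbed crossovers of the best ones.

    Each member is a dict with "hparams", "weights" and "score" keys
    (an assumed layout, chosen only for this sketch).
    """
    rng = rng or random.Random()
    bounds = bounds or {}
    if len(population) < 2:
        return population
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    # Cap k at half the population so top and bottom never overlap.
    k = max(1, min(len(ranked) // 2, int(len(ranked) * frac)))
    top, bottom = ranked[:k], ranked[-k:]
    for member in bottom:
        parent_a, parent_b = rng.choice(top), rng.choice(top)
        # Exploit: inherit the model weights of a well-performing member.
        member["weights"] = copy.deepcopy(parent_a["weights"])
        new_hparams = {}
        for name in parent_a["hparams"]:
            # Recombination (assumed uniform crossover): take each
            # hyper-parameter from one of the two parents at random.
            value = rng.choice((parent_a["hparams"][name], parent_b["hparams"][name]))
            # Explore: perturb the inherited value up or down.
            value *= rng.choice((1.0 - perturb, 1.0 + perturb))
            # Clip to a valid range when one is given (e.g. a discount factor below 1).
            low, high = bounds.get(name, (float("-inf"), float("inf")))
            new_hparams[name] = min(high, max(low, value))
        member["hparams"] = new_hparams
    return population


def train_two_stage(population, train_step, evaluate, total_steps,
                    warmup_steps, interval, rng=None, **kwargs):
    """Stage 1: train with fixed hyper-parameters for warmup_steps.
    Stage 2: adapt hyper-parameters every `interval` steps."""
    for step in range(1, total_steps + 1):
        for member in population:
            train_step(member)                 # caller-supplied training update
            member["score"] = evaluate(member)  # caller-supplied evaluation
        if step > warmup_steps and step % interval == 0:
            recombine_and_explore(population, rng, **kwargs)
    return population
```

Here `train_step` and `evaluate` are caller-supplied callables, so the sketch stays independent of any particular reinforcement learning library. Setting `warmup_steps=0` reduces it to a single-stage scheme that adapts from the first interval onward.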
Keywords/Search Tags: Machine Game, Deep Reinforcement Learning, Hyper-parameter Adaptation, Population, Two-Stage