Deep Reinforcement Learning Algorithm Based On Model Control

Posted on: 2020-01-02   Degree: Master   Type: Thesis
Country: China   Candidate: D Y Li   Full Text: PDF
GTID: 2428330596482453   Subject: Computer technology
Abstract/Summary:
In recent years, deep reinforcement learning has achieved remarkable results in fields such as computer vision, speech, natural language processing, autonomous driving, drones, robot and weapons control, and games, and has attracted extensive research in academia and industry. However, current machine learning methods, deep reinforcement learning among them, face serious theoretical obstacles. As Turing Award winner Judea Pearl has pointed out, current machine learning systems operate almost entirely in a statistical, or model-free, mode, which imposes severe theoretical limitations; achieving human-level intelligence requires guidance from a model. A large body of work has shown that model-free reinforcement learning reaches high asymptotic performance but learns inefficiently, whereas model-based algorithms learn efficiently but reach lower asymptotic performance. The key research question is therefore how to combine model-based and model-free algorithms effectively, so as to retain both the high asymptotic performance of model-free algorithms and the high learning efficiency of model-based algorithms. For decades, finding ways to combine model-based and model-free learning has been an active research topic. Representative work includes synthetic experience generation, partial model-based backpropagation, and layering model-free learning on the residuals of model-based estimation. However, a direct link between model-free and model-based reinforcement learning remains elusive, which motivates further research on combining the two.

This thesis addresses deep reinforcement learning that combines model-based and model-free algorithms, and proposes deep reinforcement learning based on model control. The research covers four aspects. First, the thesis introduces a Reward Shaping algorithm based on model control, in which the model guides the model-free algorithm indirectly by shaping the reward value. Second, it introduces a model-control-based Control Sharing algorithm, in which the model's behavior policy is shared directly with the agent with a probability governed by a Bernoulli random variable, thereby combining the model with the model-free algorithm. Third, it introduces the Generative Adversarial Imitation Learning algorithm, which uses expert samples as training data and combines the model-based and model-free algorithms through a generative adversarial network structure with the reward shaping mechanism. Fourth, it proposes a Generative Adversarial Control Sharing algorithm based on the control sharing mechanism: the reward shaping guidance mechanism in Generative Adversarial Imitation Learning is replaced with the control sharing mechanism, which makes the combination more efficient, especially in autonomous driving.

Finally, the thesis trains and validates the algorithms on the lane-change decision problem in an autonomous driving environment. The experimental results confirm the feasibility and effectiveness of deep reinforcement learning based on model control.
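
The first two mechanisms named in the abstract, reward shaping and control sharing, can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the shaping term uses the standard potential-based form, and the names `shaped_reward`, `control_sharing_action`, `potential`, `model_policy`, `agent_policy`, and `share_prob` are hypothetical rather than taken from the thesis.

```python
import random


def shaped_reward(env_reward, state, next_state, potential, gamma=0.99):
    """Indirect guidance: add a shaping term to the environment reward.

    Assumes the potential-based form r + gamma * phi(s') - phi(s), where
    `potential` (phi) is a hypothetical score derived from the model,
    e.g. how close a state is to what the model-based controller prefers.
    """
    return env_reward + gamma * potential(next_state) - potential(state)


def control_sharing_action(state, agent_policy, model_policy, share_prob,
                           rng=random):
    """Direct guidance: a Bernoulli(share_prob) draw picks the acting policy.

    With probability `share_prob` the model's action is executed,
    otherwise the model-free agent's own action is used.
    """
    use_model = rng.random() < share_prob  # one Bernoulli trial
    return model_policy(state) if use_model else agent_policy(state)


if __name__ == "__main__":
    # Toy lane-change setting (hypothetical): states are lane indices,
    # and the model-based controller wants the vehicle in lane 2.
    target_lane = 2
    potential = lambda lane: -abs(lane - target_lane)
    model_policy = lambda lane: "right" if lane < target_lane else "keep"
    agent_policy = lambda lane: "keep"  # stand-in for a learned policy

    # Moving from lane 0 to lane 1 gets closer to the target,
    # so the shaping term is positive.
    print(shaped_reward(0.0, 0, 1, potential, gamma=1.0))

    rng = random.Random(0)
    actions = [control_sharing_action(0, agent_policy, model_policy, 0.3, rng)
               for _ in range(10)]
    print(actions)
```

In a sketch like this, `share_prob` would typically be decayed over training so that the agent relies on the model early and on its own policy later; whether the thesis uses such a schedule is not stated in the abstract.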
Keywords/Search Tags: Model, Deep Reinforcement Learning, Autopilot, Expert Sample, Imitation Learning