Research On End-to-End Control Based On Deep Reinforcement Learning For Autonomous Driving

Posted on: 2022-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: D Lv
GTID: 2518306314467414
Subject: Vehicle Engineering
Abstract/Summary:
Deep reinforcement learning (DRL) combines deep learning and reinforcement learning. Deep learning can process high-dimensional input, and reinforcement learning can make appropriate decisions. Together, they allow DRL to realize an end-to-end system that maps perception directly to action. DRL is therefore well suited to continuous tasks with frequent interactions and has achieved excellent results in recent research. As a hot spot of artificial intelligence, DRL is attracting increasing attention from both academia and industry. However, current end-to-end control methods based on DRL still have shortcomings: exploration efficiency is low, training is slow, and decisions tend to be random. We therefore study the control policy for autonomous driving and a control method for an unmanned vehicle based on deep deterministic policy gradient (DDPG), addressing the inefficient exploration, the inefficient training, and the large number of unexpected actions. To evaluate the improvements, an autonomous driving simulator and a physical system are presented. The main contributions of this research are:

1) An adaptive parameter space noise method for exploration based on DDPG is proposed. To address the inefficient exploration of DDPG, we analyze how exploration is affected by adding different kinds of noise, e.g., parameter space noise and action space noise. We use neural network normalization to make the perturbation uniform across the different networks subject to parameter space noise. Adaptive parameter space noise is then added to DDPG to increase exploration efficiency.

2) An optimization method for the autonomous driving policy is proposed. Because training a DDPG-based policy for an agent is long and inefficient, we study a control method for the unmanned vehicle. Several multi-subpolicy fusion methods are compared, and the final, simple multi-subpolicy fusion method reduces the training time. Based on statistics, a method that is closest to the expected policy is proposed.

3) A method for eliminating the many unexpected actions produced by DDPG is proposed. The agent's actions are constrained by reward shaping and experience replay, which reduces unexpected risky actions, e.g., extreme steering and frequent braking.
Keywords/Search Tags:Automatic vehicles, Deep reinforcement learning, End-to-end control, Intelligent control, Self-driving