
Research On Single Agent Autonomous Driving Control Based On Deep Reinforcement Learning

Posted on: 2022-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhu
GTID: 2492306551470194
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of car ownership, the resulting traffic problems are becoming more and more serious. As a promising solution, autonomous driving is developing rapidly, driven by urgent social needs and highly developed artificial intelligence technology. In recent years, reinforcement learning has performed well in a variety of decision and control tasks and has been applied to autonomous driving control of vehicles. In reinforcement learning, an agent interacts with the environment and learns a driving policy from the environment's feedback. A recent reinforcement learning algorithm, Soft Actor-Critic (SAC), introduces the concept of maximum entropy, which makes training more stable and strengthens the agent's exploration ability. This makes a better continuous control method for autonomous driving possible. Based on SAC, this thesis proposes two improved algorithms, one addressing the adjustment of the temperature coefficient hyperparameter and the other addressing time efficiency. It also designs and implements a specific training method targeting generalization performance and the ability to cope with congested environments, two aspects that have received little attention in reinforcement learning research on autonomous driving.

First, SAC requires the temperature coefficient α to be adjusted manually so that the policy entropy suits the exploration needed in different environments. This thesis examines how α participates in SAC's policy optimization, hypothesizes that α can adjust itself to maximize return, verifies this hypothesis experimentally, and proposes a fully automatic entropy-adjustment SAC algorithm. In the experiments, the performance of the improved algorithm is close to that of the original SAC algorithm, while the work of manually adjusting α is saved.

Second, based on the internal relationship between the policy function and the value function in SAC, this thesis proposes inferring the state-action value function, rather than explicitly maintaining a state-action value function and reading policy preferences from it during policy optimization. This simplifies the function structure and algorithm steps of SAC, and yields a SAC algorithm without an explicit state-action value function. Experimental results show that the improved algorithm effectively reduces training time without reducing SAC's sample utilization.

Finally, to address generalization performance and the ability to cope with congested environments in reinforcement-learning-based autonomous driving, this thesis designs and implements a reinforcement learning control method for autonomous driving in multi-scenario congested environments. For the agent's learning environment, several driving environments are constructed on an open-source autonomous driving research platform, corresponding to four common road scenarios (sharp loop, merge, intersection, and roundabout), and interfering cars are injected into all of them to simulate congested driving. For the agent's training strategy, in addition to traditional single-scenario training, this thesis adds a multi-scenario hybrid training method. In the implementation, four reinforcement learning algorithms, including the two improved SAC algorithms proposed in this thesis, are used to train the autonomous driving agent. The experimental results show that, in single-environment training, the improved algorithms are more stable than the original SAC algorithm, especially on the sharp loop road. With the multi-scenario hybrid training method, the autonomous driving agent generalizes better to unfamiliar multi-scenario environments while retaining the congestion-handling ability needed to interact with other social vehicles; among the algorithms compared, the SAC algorithm has the best overall performance.
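The relationship between policy and value function that the abstract refers to can be illustrated with the standard maximum-entropy result: for a fixed state, the entropy-regularized optimal policy satisfies π(a|s) ∝ exp(Q(s,a)/α), so Q(s,a) = α log π(a|s) + V(s). The sketch below is a toy illustration under stated assumptions, not the algorithm proposed in the thesis: it uses a small discrete action set (the thesis concerns continuous control), and all function names are hypothetical.

```python
import math

def soft_value(q_values, alpha):
    """Soft state value V(s) = alpha * log(sum_a exp(Q(s,a) / alpha)).

    Assumes a finite (discrete) action set; the maximum is subtracted
    first for numerical stability.
    """
    m = max(q_values)
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))

def soft_policy(q_values, alpha):
    """Maximum-entropy policy pi(a|s) = exp((Q(s,a) - V(s)) / alpha)."""
    v = soft_value(q_values, alpha)
    return [math.exp((q - v) / alpha) for q in q_values]

def q_from_policy(probs, v, alpha):
    """Recover Q(s,a) = alpha * log pi(a|s) + V(s) from the policy and V(s)."""
    return [alpha * math.log(p) + v for p in probs]

def entropy(probs):
    """Shannon entropy of a discrete distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical Q-values for three actions in one state.
q = [1.0, 2.0, 0.5]
alpha = 0.5
pi = soft_policy(q, alpha)
q_recovered = q_from_policy(pi, soft_value(q, alpha), alpha)
```

In this toy setting the Q-values can be read back from the policy and the soft value, which is the kind of relationship that makes an explicit state-action value function redundant in principle. The same functions also show the role of the temperature: a larger α flattens the policy and raises its entropy, while a smaller α concentrates it on the highest-valued action.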
Keywords/Search Tags: Reinforcement Learning, Autonomous Driving, Continuous Control, Multi-Scenario, Congested Environment