Research On Behavior Decision Making Of Self-driving Vehicle Based On Meta Reinforcement Learning

Posted on:2022-06-06

Degree:Master

Type:Thesis

Country:China

Candidate:H Li

Full Text:PDF

GTID:2492306509494524

Subject:Vehicle Engineering

Abstract/Summary:

As an important component of intelligent transportation systems, intelligent vehicles are an active research topic in vehicle engineering because they help alleviate traffic congestion and reduce traffic accidents. Among the technologies that intelligent vehicles rely on, behavior decision making is one of the key technologies of autonomous driving and plays a decisive role in driving safety and overall vehicle performance. Among the various behavior decision methods, those based on meta-reinforcement learning offer high learning efficiency and good robustness and have important research value. However, current meta-reinforcement learning algorithms for the behavior decision making of unmanned vehicles require the second-order derivatives of the loss function, which is computationally intensive. To address this problem, this paper combines the first-order meta-learning algorithm Reptile with the proximal policy optimization (PPO) reinforcement learning algorithm, proposes the Meta-PPO meta-reinforcement learning algorithm, and applies it to driverless vehicle behavior decision making. The specific research contents of this paper are as follows.

(1) The Meta Proximal Policy Optimization (Meta-PPO) meta-reinforcement learning algorithm is proposed. The algorithm combines PPO with the Reptile first-order meta-learning algorithm. Its innovation is that Reptile is applied on top of the original PPO algorithm to find good initial parameters for the model, which reduces the time the model needs to learn a new task while avoiding the computation of second-order derivatives and thus reducing computational overhead.

(2) A Meta-PPO-based approach to driverless vehicle behavior decision making is investigated. For the behavior decision problem when there are no other obstacles such as pedestrians or vehicles on the road, an unmanned driving decision method based on the Meta-PPO algorithm is designed. It directly outputs actions such as acceleration and deceleration from the numerical readings of sensors such as speed sensors and distance sensors, performing end-to-end decision control of the unmanned vehicle. Experimental results on the autonomous driving simulation platform show that the Meta-PPO-based decision-making method converges better than the traditional PPO algorithm, and the vehicle can complete a full lap of the training track. In addition, the unmanned vehicle with the Meta-PPO algorithm was also able to complete a full lap on a test track with greater curvature and higher difficulty, indicating good generalization.

(3) A reinforcement-learning-based decision-making method for unmanned vehicles in a multi-vehicle environment is investigated. To address the decision-making problem when multiple unmanned vehicles are on the road, a multi-vehicle decision-making method based on the PPO reinforcement learning algorithm is first proposed, in which a centralized policy network is trained to make decisions for all unmanned vehicles. However, this method cannot solve the non-stationarity of the environment caused by multiple unmanned vehicles learning at the same time. This paper therefore proposes a multi-agent proximal policy optimization algorithm based on PPO, designs a multi-vehicle decision-making model based on multi-agent proximal policy optimization, and verifies the effectiveness of the method through experiments on an autonomous driving simulation platform.

This paper investigates the behavior decision problem in two different scenarios: single-vehicle and multi-vehicle environments. For the single-vehicle environment, the Meta-PPO meta-reinforcement learning algorithm is proposed and a single-vehicle behavior decision model based on Meta-PPO is developed. For the multi-vehicle environment, a multi-vehicle decision model based on the proximal policy optimization algorithm is proposed and a multi-vehicle decision model based on multi-agent proximal policy optimization is developed. Finally, simulation experiments are carried out on the TORCS autonomous driving simulation platform to verify the effectiveness of the models.
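
The role Reptile plays in the abstract, finding a good initialization using only first-order updates, can be illustrated with a minimal sketch. This is not the thesis's Meta-PPO implementation: the inner learner here is plain gradient descent on a toy quadratic loss standing in for PPO, and the names `inner_adapt`, `reptile`, `tasks`, and all hyperparameter values are assumptions chosen for illustration. The outer loop uses the standard Reptile rule, which moves the initialization a fraction of the way toward the task-adapted parameters.

```python
import numpy as np


def inner_adapt(theta, target, steps=5, lr=0.1):
    """Toy inner learner (a stand-in for PPO, which the thesis uses).

    Takes a few gradient steps on the loss ||phi - target||^2,
    starting from the shared initialization theta.
    """
    phi = theta.copy()
    for _ in range(steps):
        grad = 2.0 * (phi - target)  # first-order gradient only
        phi -= lr * grad
    return phi


def reptile(theta, tasks, meta_lr=0.5, meta_iters=200, seed=0):
    """Reptile outer loop: theta <- theta + meta_lr * (phi - theta).

    phi is the result of adapting theta to one sampled task, so no
    second-order derivatives of the loss are ever computed.
    """
    rng = np.random.default_rng(seed)
    theta = theta.copy()
    for _ in range(meta_iters):
        target = tasks[rng.integers(len(tasks))]  # sample one task
        phi = inner_adapt(theta, target)
        theta += meta_lr * (phi - theta)
    return theta


# Hypothetical task family: each task is a different optimum in 2-D.
tasks = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
theta0 = np.array([5.0, 5.0])       # poor initialization, far from every task
theta_meta = reptile(theta0, tasks)  # initialization learned by Reptile
```

Starting from `theta_meta`, the same few inner steps end much closer to any task optimum than they do from `theta0`, which is the sense in which a meta-learned initialization shortens learning on a new task.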

Keywords/Search Tags:

Self-Driving Vehicle, Proximal Policy Optimization Algorithms, Meta Learning, Multi-Agent Reinforcement Learning, Behavior Decision Making

Related items

1	Research On Overtaking Decision Of Autonomous Vehicles Based On Deep Reinforcement Learning
2	Research On Decision-making Method Of Highway Autonomous Driving Based On Reinforcement Learning
3	Unmanned Surface Vehicle Motion Control Based On Reinforcement Learning
4	Research On Autonomous Decision-making Algorithm For Autonomous Vehicles Based On Meta Reinforcement Learning
5	Research On Driving Decision-making Algorithms Of Intelligent Vehicles For Structured Road Environment
6	Research On Bus Speed Control Strategy Based On Multi-Agent Reinforcement Learning
7	Research On Confronting Policy Generation Method Of Multi-Agent System Based On Reinforcement Learning
8	Research On Decision-Making Of Beyond-Visual-Range Air Combat Based On Multi-Agent Reinforcement Learning
9	Research On Learning Driven Behavior Modeling Methods For Decision Making Of Computer Generated Forces(CGFs)
10	Research On Autonomous Driving Behavior Decision-Making Based On Reinforcement Learning