
Research On Unmanned Vehicle Control Method Based On Policy Gradient Reinforcement Learning

Posted on: 2022-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhang
GTID: 2492306329988589
Subject: Vehicle Engineering
Abstract/Summary:
The development of autonomous driving technology will greatly improve driving safety and reduce traffic accidents, so realizing autonomous driving is imperative. Because model-based control algorithms cannot adapt to a changing driving environment, reinforcement learning algorithms, which interact with the environment and learn from it continuously, will play an extremely important role in the field of smart cars. However, reinforcement learning algorithms have a certain failure rate, and a failure may cause a serious traffic accident. Reducing such failures in smart car applications and improving driving safety will therefore be an indispensable part of future smart car research.

This thesis takes self-driving cars as its research object. It studies an end-to-end control algorithm that maps the real-time scene to an action output, together with a recognition algorithm for the drivable area of the current scene. It then analyzes and verifies whether the drivable area contains the controlled vehicle's estimated position at the next moment (that is, whether the vehicle remains within the drivable area after the current action is performed). Finally, the two algorithms supervise and verify each other to improve the driving safety of the self-driving car. The research work mainly includes the following points:

The drivable-area recognition algorithm PSPnet is studied, and the network structure of the recognition algorithm is built. Based on the open-source BDD100K data set and a self-built data set of Carla scenes, the .json label files are converted into label images that the neural network can recognize. These label images and the original scene images are used together as the training set for the network, improving its ability to recognize the drivable area in self-driving scenes. The recognition quality is then verified on scene images from the Carla environment.

The end-to-end control algorithm Policy Gradient, which is suitable for self-driving cars, is studied. Following the basic principles of the Policy Gradient reinforcement learning algorithm, the Carla environment in which the autonomous vehicle operates, and the characteristics of vehicle control in Carla, the Policy Gradient algorithm is implemented in Python to achieve end-to-end control from the real-time camera scene to the action output. The Policy Gradient algorithm and the Carla simulation environment exchange environment data over TCP. The network outputs and the returns of the actions taken in the scene are fed into the policy network for training, continuously improving the performance of the unmanned vehicle in the environment.

Simulation verification is designed for multiple scenarios to evaluate the control performance of the end-to-end algorithm. The required vehicle model and sensor models are configured through Python and Carla scripts. Experiments are carried out in different simulated traffic scenarios, and the agreement between the vehicle's speed, acceleration, angular velocity, and trajectory and the design of the Policy Gradient return (reward) function is analyzed and evaluated. The results verify the effectiveness of the end-to-end control algorithm written in this thesis.

Finally, the correct warning rate and false alarm rate of the drivable-area recognition algorithm over the entire training process of the control algorithm are used to verify that drivable-area recognition and the end-to-end control algorithm can supervise each other, confirming the effectiveness of this scheme for improving driving safety.
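
The abstract does not give implementation details, so the following Python sketch only illustrates three ideas it names: the discounted return used by a Policy Gradient update, the check that the vehicle's estimated next position lies inside the recognized drivable area, and the correct warning and false alarm rates. All function names, the discount factor, the mask encoding (1 = drivable), and the exact definitions of the two rates are assumptions made for illustration, not the thesis's actual code.

```python
from typing import List, Sequence, Tuple


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode.

    In a basic Policy Gradient (REINFORCE) update, G_t weights the
    log-probability gradient of the action taken at step t.
    """
    returns: List[float] = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def in_drivable_area(mask: Sequence[Sequence[int]], pixel: Tuple[int, int]) -> bool:
    """Return True if the estimated next position falls on a drivable pixel.

    Assumption: `mask` is a segmentation output with 1 = drivable, 0 = not,
    and `pixel` is the (row, col) projection of the estimated position.
    """
    row, col = pixel
    if not (0 <= row < len(mask) and 0 <= col < len(mask[0])):
        return False  # a position outside the image is treated as not drivable
    return mask[row][col] == 1


def warning_rates(warned: Sequence[bool], unsafe: Sequence[bool]) -> Tuple[float, float]:
    """Return (correct warning rate, false alarm rate).

    Assumed definitions: the correct warning rate is the share of truly
    unsafe steps that raised a warning; the false alarm rate is the share
    of truly safe steps that raised a warning.
    """
    true_warnings = sum(1 for w, u in zip(warned, unsafe) if w and u)
    false_alarms = sum(1 for w, u in zip(warned, unsafe) if w and not u)
    n_unsafe = sum(1 for u in unsafe if u)
    n_safe = len(unsafe) - n_unsafe
    correct_rate = true_warnings / n_unsafe if n_unsafe else 0.0
    false_rate = false_alarms / n_safe if n_safe else 0.0
    return correct_rate, false_rate
```

In this sketch a warning would be raised whenever in_drivable_area returns False for the position the control policy is about to reach, and warning_rates compares those warnings against what actually happened in simulation, such as a lane departure or collision.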
Keywords/Search Tags: PSPnet, Policy Gradient, reinforcement learning, semantic segmentation