| With the rapid development of economic globalization and the continuous expansion of urbanization, the number of cars is increasing day by day, and urban traffic networks are becoming increasingly complex. This trend has aggravated problems such as traffic congestion, road safety, and air pollution, and new solutions are urgently needed. In recent years, driven by both technological innovation and market demand, the automotive industry has been moving toward intelligent development. With autonomous driving technology at their core, intelligent vehicles can efficiently alleviate traffic congestion, enhance road safety, and reduce air pollution, bringing significant changes to the automotive field, driving the development of related industries, and offering consumers a new lifestyle and experience. The traditional autonomous driving scheme is based on a modular system that divides the driving task into standard modules and connects these independent modules using rule-based methods. Such modular systems require significant effort to design an architecture that combines all system components. Once an error occurs, it can easily propagate throughout the architecture, resulting in dangerous driving situations. This paper addresses two problems: the rule base of behavior decision algorithms is complex to construct, and decision performance is limited by the upstream and downstream modules. Based on an analysis of existing technologies, an end-to-end behavior decision agent based on deep reinforcement learning is taken as the core, and a scene recognition module, a vehicle agent library, and an action compensation module are designed. An end-to-end behavior decision algorithm for intelligent vehicles based on path constraints is proposed. First, sparse rewards in deep reinforcement learning lead to low learning efficiency of the end-to-end behavior 
decision agent. Therefore, a method that incorporates path constraints into the reward function design is proposed for the construction of the vehicle agent model. The reward mechanism is improved by using the distance to the road center and the heading deviation from the path constraint as guidance rewards and penalties, so that the vehicle agent receives efficient and timely feedback while interacting with the environment, thereby accelerating training, improving the utilization of training samples, and improving the decision-making performance of the agent. The state space of the vehicle agent takes into account both the directly and indirectly relevant driving state information, and replaces high-precision information that is difficult to obtain with rawer perceptual information. Second, the safety of the intelligent vehicle cannot be guaranteed because the outputs of the end-to-end behavior decision agent lack interpretability. Therefore, this paper designs an action compensation module based on path constraints. Under different driving conditions, the lateral output of the vehicle agent's behavior decision is compensated according to the heading deviation constraint to improve the reliability of the agent's decision control, so that the final output action meets the safety and comfort requirements of the intelligent vehicle. To address generalization across multiple driving scenarios, a scene recognition module is designed to identify different driving scenarios. A vehicle agent library is established and combined with the scene recognition module to automatically switch vehicle agents for end-to-end behavior decisions according to the scenario. Finally, the training and tuning of the vehicle agent are completed on the CARLA simulation platform, and the interpretability of the vehicle agent is analyzed by visualizing convolutional feature maps and class activation maps, which 
effectively explains the working principle and decision-making process of the vehicle agent. A closed-loop evaluation task is set up to evaluate the end-to-end behavior decision system. In this evaluation, the intelligent vehicle can complete the autonomous driving task quickly, safely, and smoothly. The effectiveness of the action compensation module in ensuring driving safety is verified by comparing vehicle driving indicators. Through the analysis of two scenario decision examples, the effectiveness of the proposed path-constraint-based end-to-end behavior decision algorithm for intelligent vehicles is verified. |
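
The abstract describes two path-constraint mechanisms without giving formulas: a guidance reward built from the distance to the road center and the heading angle deviation, and a compensation of the agent's lateral output. The following Python sketch illustrates one way such mechanisms could look. It is not the paper's implementation. The function names, the linear reward shape, the equal weights, the thresholds, the speed term, and the sign convention (positive heading error corrected by negative steering, with steering normalized to [-1, 1]) are all assumptions made for illustration.

```python
import math


def path_constraint_reward(center_dist, heading_err, speed,
                           lane_half_width=1.75,
                           max_heading_err=math.radians(30),
                           target_speed=8.0):
    """Dense guidance reward from path constraints (hypothetical form).

    center_dist: signed lateral offset from the road center, in meters.
    heading_err: vehicle heading minus path heading, in radians.
    speed:       current speed in m/s.
    All default limits and weights are illustrative assumptions.
    """
    # Penalize immediately when either path constraint is violated,
    # instead of waiting for a sparse end-of-episode signal.
    if abs(center_dist) > lane_half_width or abs(heading_err) > max_heading_err:
        return -1.0
    # Each term equals 1 on the path and decays linearly to 0 at its limit.
    r_center = 1.0 - abs(center_dist) / lane_half_width
    r_heading = 1.0 - abs(heading_err) / max_heading_err
    # Progress factor so that standing still on the centerline earns nothing.
    r_speed = min(max(speed, 0.0), target_speed) / target_speed
    return r_speed * (0.5 * r_center + 0.5 * r_heading)


def compensate_steering(steer, heading_err,
                        threshold=math.radians(10), gain=0.5):
    """Correct the agent's lateral action when heading error is large.

    Assumes positive steering increases the heading angle, so a positive
    heading error is countered with a negative steering correction.
    """
    if abs(heading_err) > threshold:
        steer -= gain * heading_err
    # Clamp to the assumed normalized steering range.
    return max(-1.0, min(1.0, steer))


if __name__ == "__main__":
    # On the centerline, aligned with the path, at target speed.
    print(path_constraint_reward(0.0, 0.0, 8.0))
    # Agent outputs no steering while pointing 20 degrees off the path.
    print(compensate_steering(0.0, math.radians(20)))
```

In this sketch the reward is available at every step, which is the property the abstract relies on to counter sparse rewards, and the compensation only intervenes once the heading error passes a threshold, leaving the learned policy untouched in nominal driving.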