| The Adaptive Cruise Control (ACC) system is an advanced driver assistance system (ADAS) that extends traditional cruise control by automatically maintaining a safe distance between vehicles. Its main function is therefore vehicle following. By measuring the state of the preceding vehicle through sensors such as radar, the ACC system controls the vehicle’s drive and brake actuators to adjust vehicle speed automatically, ensuring that the vehicle follows the preceding vehicle within a safe distance. However, the following function of current rule-based ACC systems does not take the driver’s individual driving characteristics into account. As a result, a fixed acceleration strategy or relative distance cannot adapt to different types of drivers, which reduces driving comfort, driver acceptance, and satisfaction with ACC systems. Accounting for driver characteristics and improving comfort is therefore an important research direction for future ACC systems and one of the key technologies for future intelligent vehicles. For this reason, this paper studies a personalized adaptive cruise control strategy based on reinforcement learning, aiming to integrate the driving characteristics of individual drivers into ACC decision-making so that the ACC follows vehicles in a human-like manner and improves driving comfort. The main research contents and results are as follows. (1) A real-vehicle driving data collection system was built, data collection conditions were designed around the typical operating conditions of an ACC system, and driving data was collected on urban roads. The collected data was preprocessed, the valid vehicle-following segments were extracted, and the typical characteristics of the driving data were analyzed. (2) A traditional ACC control strategy was established. The traditional ACC control strategy framework was analyzed, and 
the layered control method was adopted to divide the ACC into upper-level decision control and lower-level execution control. In the upper-level decision control, the cruise mode used P control and the following mode used model predictive control as the decision algorithm. For the lower-level execution control, a vehicle longitudinal dynamics model was built, and vehicle acceleration and deceleration were controlled based on this model. Finally, the traditional ACC control strategy was verified in simulation. (3) Under the traditional ACC control framework, a personalized ACC control strategy was designed using a reinforcement learning algorithm. The driver’s decision-making process when following a vehicle was analyzed, the following process was abstracted as a Markov Decision Process, and a reinforcement learning theoretical framework was established. Based on the driver’s following characteristics, the Nature Deep Q-Learning algorithm was used to realize hierarchical control of relative distance and vehicle speed. Finally, a reward function was designed around three indicators of the ACC control strategy: vehicle following, comfort, and safety. (4) Building on the personalized ACC control algorithm, the network structure and the experience replay training process were optimized using Dueling DQN and Prioritized Replay DQN, respectively, improving both the performance and the training of the algorithm. In addition, because the anthropomorphic reward function determines the decision-making direction of the personalized ACC, inverse reinforcement learning was used, with reference to the reward function proposed above, to learn a reward function from the collected data. This yielded a data-based anthropomorphic reward function that brings the algorithm’s decision trajectory close to the driver’s decision 
trajectory in the actual data. A MATLAB/Simulink-CarSim co-simulation platform was built to simulate and test the trained personalized ACC strategy. The following behavior of the personalized ACC and the traditional ACC was compared and quantitatively analyzed. The results show that, when following a vehicle, the personalized ACC strategy matches the driver’s habitual behavior more closely than the traditional ACC strategy and takes the driver’s characteristics into account better in decision-making. The analysis of performance indicators further shows that the personalized ACC control strategy has advantages over the traditional ACC in following performance measures such as driving comfort. |
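
The two DQN refinements named in item (4) can be illustrated with a minimal sketch. The abstract does not give the network layout, the action set, or any hyperparameters, so everything specific here is an assumption for illustration: the three-action set (decelerate, hold, accelerate), the exponent `alpha`, and the offset `eps` are hypothetical. The sketch shows only the standard Dueling DQN aggregation, Q(s, a) = V(s) + A(s, a) - mean(A), and the standard proportional prioritized replay probability, P(i) = p_i^alpha / sum_k p_k^alpha.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Uses the mean-subtracted aggregation: Q(s, a) = V(s) + A(s, a) - mean(A).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def replay_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay sampling probabilities.

    Priority is p_i = |TD error_i| + eps; alpha and eps are assumed values,
    not taken from the paper.
    """
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


# Hypothetical discrete action set: decelerate, hold, accelerate.
q = dueling_q_values(state_value=1.0, advantages=[-0.5, 0.0, 0.5])

# Transitions with larger TD error are sampled more often during training.
probs = replay_probabilities(td_errors=[0.1, 2.0, 0.5])
```

In a full agent, `state_value` and `advantages` would be the outputs of the two network heads, and `probs` would drive which stored transitions are replayed; neither detail is specified in the abstract.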