| Decision-making and planning (DMAP) are crucial components of autonomous driving systems. However, designing driving strategies for complex scenarios, such as merging on highways and passing through bottlenecks, still presents challenges. In such scenarios, a vehicle’s decision-making is significantly influenced by the behaviors of surrounding vehicles, and irrational driving behaviors can reduce overall traffic flow efficiency. Traditional methods are ineffective because they cannot account for interactions between vehicles. Reinforcement learning has advantages in handling complex environments and sequential decision-making problems, and is widely applied in autonomous driving. However, existing reinforcement learning-based DMAP algorithms have limitations, such as focusing on the interests of the ego vehicle while neglecting overall interests and the conflicts that arise in complex scenarios. To address these issues, a concept from psychology, Social Value Orientation (SVO), is introduced to the autonomous driving domain to design socially cooperative driving strategies, potentially improving overall performance. This paper aims to design DMAP strategies from both the single-vehicle and multi-vehicle perspectives, improving the performance of the ego vehicle while considering the performance of the overall traffic flow. This paper proposes Select SVO, a single-vehicle driving algorithm based on SVO selection. Single-vehicle intelligence is currently the mainstream approach in autonomous driving. Existing single-vehicle driving algorithms optimize only the performance of the ego vehicle, which decreases overall traffic flow efficiency in complex scenarios with high-density traffic. Select SVO therefore optimizes the performance of the ego vehicle while improving overall traffic flow efficiency. It mainly comprises a dynamic reward function based on SVO and a network structure based on the 
Deep Set model and an attention mechanism. Compared with the other two algorithms, Select SVO successfully coordinates conflicts between the interests of the ego vehicle and other vehicles, improving overall traffic flow performance. This paper also proposes Recog SVO, a two-stage SVO recognition algorithm. Designing autonomous driving strategies from the multi-vehicle perspective is reasonable. Multi-agent reinforcement learning (MARL) has become a promising solution for constructing multi-agent autonomous driving systems in complex scenarios. However, most methods assume that vehicles behave selfishly, which leads to conflicts. Some existing methods incorporate SVO to promote coordination but lack knowledge of other vehicles’ SVOs, resulting in conservative behaviors. This paper aims to solve these problems by enabling vehicles to understand other vehicles’ SVOs. To this end, a two-stage framework is proposed. First, a coordinated traffic flow is established by allowing vehicles to share their true SVOs. Second, an SVO recognition network is designed to estimate the SVOs of vehicles and is combined with the policy trained in the first stage to construct the final driving strategy, Recog SVO. Experiments show that, compared with three other algorithms, Recog SVO significantly improves overall performance, for example a 12.5% improvement in success rate in the highway merging scenario. |
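
To illustrate the idea of an SVO-based reward, the following is a minimal sketch using the angle-based SVO formulation that is common in the literature. The abstract does not state the exact reward used by Select SVO or Recog SVO, so the function name `svo_reward`, the averaging over surrounding vehicles, and the angle convention are all assumptions made for illustration only.

```python
import math
from typing import Sequence


def svo_reward(r_ego: float, r_others: Sequence[float], phi: float) -> float:
    """Blend the ego reward with surrounding vehicles' rewards via an SVO angle.

    Assumed convention (not taken from the paper): phi is in radians,
    phi = 0 is purely egoistic and phi = pi/2 is purely altruistic.
    """
    # Average the rewards of surrounding vehicles; use 0 if there are none.
    r_other_mean = sum(r_others) / len(r_others) if r_others else 0.0
    return math.cos(phi) * r_ego + math.sin(phi) * r_other_mean


if __name__ == "__main__":
    # Hypothetical per-step rewards for the ego vehicle and two neighbours.
    orientations = [
        ("egoistic", 0.0),
        ("prosocial", math.pi / 4),
        ("altruistic", math.pi / 2),
    ]
    for name, phi in orientations:
        print(name, round(svo_reward(1.0, [0.2, 0.4], phi), 3))
```

Because `phi` is an ordinary argument, a policy could in principle choose or estimate it at each step, which is one way a reward of this form might be made dynamic; whether the proposed algorithms do so in this manner is not specified here.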