| As the basic unit of an urban road network, the signal-controlled intersection separates traffic flows in time and space and plays a vital role in alleviating congestion and improving road capacity. Regional signal coordination control has therefore become a key research issue in traffic management. In this paper, a correlation degree model between intersections is established that takes full account of traffic flow and signal timing parameters, and an improved Newman algorithm is used to divide traffic control subareas dynamically. By improving the state representation and suitably defining the action and reward functions, signal timing optimization is realized for a single-point intersection. On this basis, an agent control model based on communication is designed by introducing inter-agent communication and adding the traffic flow information of the section in the coordinated direction. The multi-agent regional signal coordination model is complemented by a control model independent of agent communication, so as to realize trunk and regional signal coordination. Firstly, this paper reviews and summarizes recent research on traffic control subarea division and on traffic signal control based on deep reinforcement learning, and sets out the research content and structure of the paper. It then introduces the theory of traffic signal control, subarea division and reinforcement learning, together with deep reinforcement learning algorithms. Secondly, a subarea division method based on the improved Newman algorithm is proposed. Considering the distance between intersections, queue length, travel time, path flow, signal cycle length and other factors together, the correlation between intersections is analyzed quantitatively and a path-cycle correlation degree model is established. The results show that after the improved Newman algorithm is used to divide the traffic control subareas, the number of stops on the coordinated section of 
the road network is reduced by about 39%, and the average vehicle delay is reduced by about 55%. Compared with the traditional Newman algorithm, the number of stops is reduced by about 25%, and the average vehicle delay is reduced by about 13%. Thirdly, building on the representation of intersection status by discrete traffic state encoding, this paper takes a vector composed of vehicle distribution and speed information on the entrance lanes as the state input and improves the decision-making mode of the DQN algorithm. The result is DQN_Vector, a deep reinforcement learning traffic signal control model based on a vector state representation, which realizes adaptive control of traffic signals. The experimental results show that DQN_Vector achieves a better control effect: compared with fixed-time signal control, the average vehicle delay is reduced by about 21.4% and the stopped delay by about 28.0%. Compared with DQN_Matrix, the traditional deep reinforcement learning traffic signal control model based on a matrix state representation, the computational speed is improved by about 52.4% with a similar control effect. In addition, the traffic flow information of the upstream section in the coordinated direction is added to the state space of single-point control, and a communication-independent multi-agent trunk signal coordination model is designed to compare the influence of state definition, inter-agent communication and training mode on the training effect. The results show that, compared with fixed-time signal coordination control, the average delay of trunk-line vehicles decreases by 18.6%, the number of stops decreases by 5.3%, and the average travel speed in the coordinated direction increases by 11.4% from south to north and 6.6% from north to south. Finally, combining the characteristics of the various models, a multi-agent regional signal coordination model is designed, which is mainly based on the communication agent control model and supplemented by the control model 
independent of agent communication. When all devices are working normally, the agents share state information through inter-agent communication to realize regional signal coordination. When an agent fails, fixed-time control is adopted at the faulty intersection, and the remaining intersections achieve regional signal coordination by obtaining the traffic flow information of the upstream section in the coordinated direction, which preserves the coordination effect and increases the overall reliability of the system. The experimental results show that, compared with fixed-time control and under fixed flow, the multi-agent regional signal coordination model reduces the average delay of road network vehicles by 28.4%, reduces the number of stops by 10.9%, and increases the speed in the coordinated direction by 8%. When the flow varies, it reduces the average delay of road network vehicles by about 29.2%, reduces the number of stops by about 7.9%, and increases the speed in the coordinated direction by about 8.8%. Even if some agents fail and fixed-time control must be used in their place, the model still reduces the average delay of road network vehicles by 22.7%, reduces the number of stops by 11.5%, and increases the speed in the coordinated direction by 6.0%... |
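
The fallback behaviour described at the end of the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `ControlMode` and `select_modes`, and the health flags passed in, are hypothetical. It also assumes one reading of the text, namely that a single failed agent moves every remaining intersection to the communication-independent model that uses upstream traffic flow information, while the failed intersection uses timing (fixed-time) control.

```python
from enum import Enum


class ControlMode(Enum):
    """Hypothetical labels for the three control modes named in the abstract."""
    COMMUNICATION = "communication"   # agents share state with each other
    UPSTREAM_FLOW = "upstream_flow"   # communication-independent model
    FIXED_TIMING = "fixed_timing"     # timing control at a failed intersection


def select_modes(agent_ok):
    """Map each intersection id to a control mode.

    agent_ok: dict of intersection id -> True if that agent is working.
    Assumption: any failure switches all healthy agents to the
    communication-independent model; failed ones use timing control.
    """
    if all(agent_ok.values()):
        # Normal operation: every agent coordinates through communication.
        return {node: ControlMode.COMMUNICATION for node in agent_ok}
    return {
        node: ControlMode.UPSTREAM_FLOW if ok else ControlMode.FIXED_TIMING
        for node, ok in agent_ok.items()
    }
```

Calling `select_modes({"A": True, "B": False, "C": True})` under these assumptions assigns timing control to `B` and the upstream-flow model to `A` and `C`. A real controller would also need a way to detect agent failure, which the abstract does not describe.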