As China's urbanization proceeds rapidly, the number of motor vehicles keeps increasing, which leads to ever more severe urban traffic congestion. Adaptive traffic signals adjust the signal timing at intersections in real time according to the traffic flow, thereby ensuring orderly vehicle passage and improving transportation efficiency. In recent years, traffic signal control methods based on deep reinforcement learning (DRL) have combined deep neural networks with reinforcement learning strategies, effectively handling the high-dimensional data and continuous states that arise in traffic scenarios and solving the model-free dynamic programming problem in traffic signal control. Current DRL-based traffic signal control methods mainly fall into single-agent and multi-agent approaches. This thesis first studies a single-agent DRL method and a centralized multi-agent DRL method for signal control at a single urban intersection, and then expands the scope to multiple intersections in an urban road network, where distributed multi-agent methods are employed for collaborative signal control.

Our main contributions are as follows:

(1) A DDQN-based traffic signal control framework is proposed. Real-time traffic flow and vehicle queue information are first obtained from a mathematical model, and the states, actions, and rewards of the DRL method are defined on this basis. Under identical state, action, and reward definitions, both the proposed DDQN network and a comparative DQN network are trained. Experiments are then conducted in a simulated single-intersection environment, and the proposed method is compared with other methods. The results demonstrate that it converges more quickly and achieves better control performance (a minimal sketch of the DQN/DDQN target computation is given below).

(2) An A2C_RTQL model based on centralized multi-agent learning is proposed for single-intersection signal control. The LWR shockwave principle is used to estimate the real-time vehicle queue length in each incoming lane (see the worked equation below), and the estimate serves as the state and reward of the multi-agent model. The single-intersection environment is split into multiple parallel environments, and a global agent controls the signals. Simulation results indicate that the proposed method observes the intersection environment in finer detail and therefore outperforms single-agent reinforcement learning models on single-intersection control.

(3) A MA2C_AS model based on distributed multi-agent learning is proposed. A subgraph reconstruction algorithm first adaptively perceives how strongly each target intersection is influenced by the other intersections in the network, eliminating irrelevant nodes and constructing a subgraph that reduces interference from distant intersections (an illustrative subgraph-construction sketch is given below). Data is then shared among the agents within the subgraph, so that the traffic signals are optimized globally through cooperative adaptive control. Simulation experiments show that the control performance is excellent both in complex road-network scenarios and in small-scale, simple traffic-flow scenarios, which validates the algorithm's transferability and generalization capability across two distinct road networks.
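As a rough illustration of how the DDQN used in contribution (1) differs from the DQN baseline, the sketch below computes the bootstrapped TD target both ways for a batch of transitions. It is a minimal sketch under assumed settings: the network architecture, the state and action dimensions, and the discount factor are placeholders, not the values or the state/action/reward definitions used in the thesis.

# Minimal sketch of the DQN vs. DDQN target computation (illustrative only:
# network size, state/action dimensions and gamma are assumptions, not the
# thesis settings).
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, GAMMA = 8, 4, 0.95   # assumed values


def make_q_net():
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                         nn.Linear(64, N_ACTIONS))


online_net, target_net = make_q_net(), make_q_net()
target_net.load_state_dict(online_net.state_dict())


def td_targets(reward, next_state, done, double=True):
    """Return the bootstrapped TD target for a batch of transitions."""
    with torch.no_grad():
        q_next_target = target_net(next_state)            # Q'(s', .)
        if double:
            # DDQN: the online net selects the next action, the target net
            # evaluates it, which reduces the over-estimation bias of DQN.
            best_a = online_net(next_state).argmax(dim=1, keepdim=True)
            q_next = q_next_target.gather(1, best_a).squeeze(1)
        else:
            # DQN: the same (target) net both selects and evaluates.
            q_next = q_next_target.max(dim=1).values
        return reward + GAMMA * (1.0 - done) * q_next


# Example usage on a dummy batch of 32 transitions.
reward = torch.zeros(32)
next_state = torch.randn(32, STATE_DIM)
done = torch.zeros(32)
print(td_targets(reward, next_state, done, double=True).shape)  # torch.Size([32])

Decoupling action selection (online network) from action evaluation (target network) is what counteracts the value over-estimation of plain DQN, which is consistent with the faster convergence reported for the DDQN-based framework above.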
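Contribution (2) estimates real-time queue lengths from the LWR shockwave principle. The worked equation below shows the standard form of such an estimate; the notation (arrival flow q_a, arrival density k_a, jam density k_j, observation window \Delta t) is the usual LWR notation and is assumed here for illustration rather than taken from the thesis.

\[
w \;=\; \frac{q_a - 0}{k_a - k_j},
\qquad
L_{\mathrm{queue}}(\Delta t) \;=\; |w|\,\Delta t \;=\; \frac{q_a\,\Delta t}{k_j - k_a}
\]

That is, the back of the queue propagates upstream at the shockwave speed w given by the Rankine-Hugoniot condition between the arriving traffic state (q_a, k_a) and the stopped state (0, k_j), so during a red interval of length \Delta t the queue length grows linearly with the arrival flow.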
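Contribution (3) builds, for each target intersection, a subgraph containing only the intersections that noticeably influence it. The sketch below shows one simple way such a subgraph could be constructed; the influence scores, the threshold, and the expansion rule are illustrative assumptions, not the subgraph reconstruction criterion defined in the thesis.

# Illustrative sketch of per-intersection subgraph construction from pairwise
# influence scores.  The influence measure, threshold and expansion rule are
# assumptions made for illustration only.
from typing import Dict, List, Set


def build_subgraph(target: int,
                   adjacency: Dict[int, List[int]],
                   influence: Dict[int, float],
                   threshold: float = 0.1) -> Set[int]:
    """Expand outward from the target intersection, keeping only neighbours
    whose influence score on the target exceeds the threshold."""
    kept, frontier = {target}, [target]
    while frontier:
        node = frontier.pop()
        for nbr in adjacency.get(node, []):
            if nbr not in kept and influence.get(nbr, 0.0) >= threshold:
                kept.add(nbr)          # relevant node: include in subgraph
                frontier.append(nbr)   # and keep expanding from it
    return kept                        # distant, weakly coupled nodes are dropped


# Example: a 5-intersection line network where only nodes 1 and 2 influence node 0.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
influence = {1: 0.8, 2: 0.3, 3: 0.05, 4: 0.01}
print(build_subgraph(0, adjacency, influence))   # {0, 1, 2}

Restricting data sharing to the nodes returned by such a procedure is what keeps the cooperative control local and limits interference from distant intersections, as described in contribution (3).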