Traffic signals at road intersections direct the orderly passage of vehicles from all approaches, but the timing control in current use struggles to allocate right of way reasonably and is therefore prone to causing congestion. Intelligent traffic signals are thus a likely direction of future development. This thesis proposes an intelligent-signal scheme to reduce vehicle delay at isolated intersections. The scheme monitors real-time traffic through machine vision and optimizes signal timing with a deep reinforcement learning algorithm.

First, based on video surveillance footage from actual intersections, machine vision methods are studied for detecting traffic flow, average speed, average queue length, and other parameters. A binary decision tree is built on top of the probabilistic Hough transform to improve the accuracy of lane detection, and Hu moment invariants are used to match the contours of standard lane direction markings. A checkerboard-shaped traffic flow detection area is constructed from the endpoints of the dashed lane markings to narrow the detection range. Within this area, the recent YOLOX object detector and the fast SORT multi-object tracking algorithm are used to locate vehicles accurately, and traffic flow is counted and speed is estimated with the help of reference lines. The concept of a "stationary point" is proposed: a first-order difference is computed over the trajectory sequence of each bounding box to determine whether the vehicle is queuing. In each queuing lane, the bounding boxes are projected onto the lane centerline, and the distribution of distances between the projections is used to identify the vehicle at the rear of the queue and thus obtain the queue length.

Then, according to the characteristics of actual intersections, a deep reinforcement learning timing model is constructed that satisfies the constraints imposed by multi-lane detectors and countdown displays, and the state representation, action space, reward function, selection strategy, network structure, and other elements of the model are studied. Specifically, the traffic state matrix is built from multi-lane traffic flow, speed, and queue length. The optimal cycle length is calculated with the Webster method, and timing plans with different green splits are constructed as the action space. A zero-reward delay factor is introduced to convert delay time into an inverse reward value, which weakens the influence of the magnitude of the delay on the reward. The timing plan is selected with an ε-greedy strategy, and a cosine function is used to decay the exploration rate so that the agent explores more in the early stage of training. A multi-layer fully connected neural network infers and learns the optimal timing action for the next cycle. The network is trained with the DQN algorithm, and the Huber loss function is used to accelerate convergence. To verify the performance of the algorithm, a visual timing simulation system integrating VISSIM and PyQt is developed, and the algorithm is compared with fixed timing. The results show that the algorithm effectively improves timing performance.

Finally, a system test platform is designed in which an industrial PC serves as the main control unit, cameras transmit real-time video streams over the network, the industrial PC communicates with an embedded controller over LoRa wireless links, and the embedded controller drives the signal lights and countdown displays. The traffic monitoring and signal timing algorithms are ported to this platform for testing, and the results suggest that the system achieves the desired control effect.

The research results indicate that the deep reinforcement learning timing algorithm can still select an optimal timing plan according to the real-time traffic state after accounting for practical constraints such as the types of detector data available, the limits of the countdown display, and the safety of signal phase switching, and that its timing performance is better than that of fixed timing. In addition, the application scenarios of the deep reinforcement learning timing algorithm are extended, and the video monitoring scheme based on object detection and tracking can extract the traffic states that the timing algorithm requires, which gives the results of this thesis some prospect of practical application.
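To make the queuing test concrete, the following is a minimal sketch of the "stationary point" idea, not the implementation used in this thesis. It assumes that each track is a list of bounding-box centers in pixels, one per frame, and that the projections onto the lane centerline are already expressed in meters from the stop line. The function names, the thresholds `eps_px`, `min_frames`, and `max_gap_m`, and the gap rule used to find the rear of the queue are all illustrative assumptions.

```python
from math import hypot


def first_order_diff(track):
    """Frame-to-frame displacement magnitudes of a box-center trajectory."""
    return [hypot(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(track, track[1:])]


def is_queuing(track, eps_px=1.5, min_frames=5):
    """Treat a vehicle as queuing when its last `min_frames` displacements
    are all below `eps_px` (both thresholds are assumed values)."""
    diffs = first_order_diff(track)
    if len(diffs) < min_frames:
        return False
    return all(d < eps_px for d in diffs[-min_frames:])


def queue_length(projections_m, max_gap_m=12.0):
    """Walk outward from the stop line (0 m) over the centerline projections
    of stationary vehicles; stop at the first gap larger than `max_gap_m`.
    The last position reached is taken as the rear of the queue."""
    tail = 0.0
    for p in sorted(projections_m):
        if p - tail > max_gap_m:
            break
        tail = p
    return tail


# Hypothetical tracks: one vehicle keeps moving, one comes to rest.
moving = [(0.0, 10.0 * i) for i in range(10)]
stopped = [(0.0, 0.0), (0.0, 10.0), (0.0, 20.0)] + [(0.0, 20.2)] * 6
```

In this sketch a vehicle far behind the queue (for example at 60 m when the queue ends at 15 m) is excluded because the gap to the previous stationary vehicle exceeds `max_gap_m`.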
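Two ingredients of the timing model can also be sketched briefly: the Webster cycle length and the cosine-decayed exploration rate used for action selection. The cycle formula is the standard Webster expression, C0 = (1.5L + 5) / (1 - Y), where L is the total lost time per cycle and Y is the sum of the critical flow ratios. The particular cosine schedule, the bounds `eps_max` and `eps_min`, and the function names are assumptions made for illustration; the abstract does not specify the exact form used.

```python
import math
import random


def webster_cycle(lost_time_s, flow_ratios):
    """Webster optimal cycle length in seconds: (1.5 L + 5) / (1 - Y)."""
    y_sum = sum(flow_ratios)
    if not 0.0 < y_sum < 1.0:
        raise ValueError("sum of critical flow ratios must lie in (0, 1)")
    return (1.5 * lost_time_s + 5.0) / (1.0 - y_sum)


def cosine_epsilon(step, total_steps, eps_max=1.0, eps_min=0.05):
    """Half-cosine decay from eps_max to eps_min (assumed schedule).
    It falls slowly at first, so early exploration stays high."""
    t = min(max(step, 0), total_steps)
    return eps_min + 0.5 * (eps_max - eps_min) * (
        1.0 + math.cos(math.pi * t / total_steps))


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over the candidate timing plans."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Compared with a linear schedule between the same bounds, this cosine schedule keeps the exploration rate higher during the first half of training, which matches the stated aim of increasing the agent's early exploration probability.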