| With urban development, increasingly severe traffic congestion causes large losses in efficiency, energy, and human life. As the main approach to alleviating the problem, Intelligent Transportation Systems have attracted growing attention and developed rapidly. Intelligent traffic light control based on reinforcement learning has become one of the main means of easing congestion because it adapts well to dynamic, changing traffic network environments. First, building on the original control scheme, a new intersection cooperation scheme is proposed, offering a new way of thinking about traffic control. Cooperation between adjacent intersections is not simple information exchange; instead, game theory is used to find the Nash equilibrium between them, which makes the cooperation more effective. Under the Markov game decision scheme, an intersection controller operating in the road network does not merely choose the locally optimal action but, through the cooperation mechanism, tends to choose the action that is more advantageous to the global control scheme. Extensive experiments show that this control scheme outperforms the Max-plus control method based on coordination graphs. Second, from the perspective of traffic trend analysis, an intelligent control scheme that combines forecast analysis with a lane model is proposed. 
According to the historical values of each lane in the traffic network, an appropriate ARIMA forecasting model is established to predict future traffic flow. Based on the predicted data, the trend in traffic capacity is analyzed with a structural model of the lane's dynamic flow to achieve intelligent control of the whole network. Because this method both captures the trend of traffic flow and accounts for the dynamic correlation within the network itself, it works better than the original TC1 control method. Finally, considering the limitations of intelligent control algorithms in practical application, we put forward an optimized control scheme under POMDP conditions. Current research assumes that the intersection controller has full access to the vehicle information of the relevant lanes, so existing methods apply only to fully observable settings. In a real network, however, limitations of the sensors themselves and external physical conditions mean that the information obtained at an intersection is incomplete; applying the control scheme to real traffic therefore requires studying the algorithm under partially observable conditions. On this basis, this paper studies reinforcement learning control under POMDP conditions. We propose a new method for deriving the belief state of a lane from the belief states of its cars and put forward a corresponding optimization solution. |
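
The abstract does not spell out how the lane belief state is built from the car belief states, so the following is only a minimal illustrative sketch, not the paper's method. It assumes a lane discretized into cells, one position belief per car, a single Bayes update per car from a sensor likelihood, and independence between cars when combining them. The function names `update_car_belief` and `lane_occupancy_belief` and all numbers are hypothetical.

```python
from math import prod


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total == 0:
        raise ValueError("belief has zero total mass")
    return [w / total for w in weights]


def update_car_belief(prior, likelihood):
    """One Bayes step per cell: posterior is proportional to likelihood * prior."""
    return normalize([p * l for p, l in zip(prior, likelihood)])


def lane_occupancy_belief(car_beliefs):
    """Probability that each cell holds at least one car.

    Assumption (not from the paper): car positions are independent, so
    P(cell occupied) = 1 - product over cars of (1 - P(car in cell)).
    """
    n_cells = len(car_beliefs[0])
    return [1.0 - prod(1.0 - b[c] for b in car_beliefs) for c in range(n_cells)]


# Hypothetical 4-cell lane with two cars.
# Car A starts from a uniform prior and receives a noisy reading near cell 1.
car_a = update_car_belief([0.25, 0.25, 0.25, 0.25], [0.1, 0.7, 0.1, 0.1])
# Car B is believed to be in cell 1 or cell 2 with equal probability.
car_b = [0.0, 0.5, 0.5, 0.0]

lane = lane_occupancy_belief([car_a, car_b])
print([round(p, 3) for p in lane])
```

A controller under partial observability could then act on `lane` instead of on exact vehicle positions. The independence assumption is a simplification, since real cars in one lane constrain each other's positions.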