
Research On The Intelligent Control Method For Urban Traffic Signal

Posted on: 2008-09-08  Degree: Master  Type: Thesis
Country: China  Candidate: W Peng  Full Text: PDF
GTID: 2178360212995787  Subject: Computer application technology
Abstract/Summary:
This dissertation discusses the application of intelligent control methods to urban traffic signals. The algorithms used are Q-learning, from Reinforcement Learning, and the Back Propagation algorithm, from Artificial Neural Networks.

According to Wooldridge's definition, an Agent is an individual with the following properties: autonomy, goal-orientation, reactivity, and social ability.

The main problem Reinforcement Learning deals with is how to make an Agent perceive its environment and choose the optimal sequence of actions to achieve its goals. Each action the Agent takes in the environment receives a prize or punishment as a reward. From this delayed reward the Agent can indirectly learn to choose a series of actions that obtains the highest cumulative reward. Through trial-and-error search in the space of states and actions, Reinforcement Learning uses the delayed reward from the environment to reinforce good actions and weaken poor ones, thereby optimizing the strategy. Trial-and-error search and delayed reward are the two most notable features of Reinforcement Learning.

Q-learning is a very effective model-free Reinforcement Learning algorithm. It learns the action evaluation function Q(s, a) in order to maximize the reward. In Q-learning, Q(s, a) is defined as the largest discounted cumulative reward obtainable by starting from state s and choosing a as the first action. In other words, the Q value is the immediate reward received by executing action a in state s, plus the value obtained by thereafter following the optimal strategy, discounted by the coefficient γ:

    Q(s, a) = r(s, a) + γ · max_{a'} Q(δ(s, a), a')

Assume the set of environment states the Agent can distinguish is S, and the set of actions it can choose is A. δ(s, a) is the state transition function, giving the state reached after executing action a ∈ A in state s ∈ S, and r(s, a) is the reward received after executing action a ∈ A in state s ∈ S. The task of the Agent is to learn a control strategy π: S → A.
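The Q-learning update just described can be sketched in a few lines. The toy two-state environment, the learning rate, and the episode count below are illustrative assumptions, not the dissertation's traffic model; only the update rule itself follows the definition above.

```python
import random

GAMMA = 0.9   # discount coefficient (γ in the text)
ALPHA = 0.5   # learning rate for the incremental update

# Toy deterministic environment: δ(s, a) and r(s, a) as lookup tables.
delta = {(0, 'left'): 0, (0, 'right'): 1,
         (1, 'left'): 0, (1, 'right'): 1}
reward = {(0, 'left'): 0, (0, 'right'): 1,
          (1, 'left'): 0, (1, 'right'): 2}

actions = ['left', 'right']
Q = {(s, a): 0.0 for s in (0, 1) for a in actions}

random.seed(0)
state = 0
for _ in range(500):
    a = random.choice(actions)            # trial-and-error exploration
    s_next = delta[(state, a)]
    r = reward[(state, a)]
    # Update toward r(s, a) + γ · max_a' Q(δ(s, a), a').
    target = r + GAMMA * max(Q[(s_next, b)] for b in actions)
    Q[(state, a)] += ALPHA * (target - Q[(state, a)])
    state = s_next

# The learned greedy strategy π(s) = argmax_a Q(s, a).
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in (0, 1)}
```

Because 'right' yields the only positive rewards here, the learned policy prefers it in both states.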
Based on the observation of the current state s_t, the Agent chooses to execute action a_t, i.e. π(s_t) = a_t. It is hoped that the strategy the Agent learns obtains the largest sum of rewards.

An Artificial Neural Network is an artificial network system that simulates the structure and functions of the human nervous system using a large number of processing elements. It can acquire the knowledge expressed by data through study and training; in addition to memorizing information that is already known, it has a strong capacity for generalization and association. Rumelhart, Hinton and Williams proposed the Back Propagation algorithm for training feed-forward neural networks. Its main idea is to divide the learning process into two stages. In the first stage (forward signal propagation), the input signal is processed layer by layer through the input layer and the hidden layer, and the actual output of each node is computed. In the second stage (backward error propagation), if the output layer does not produce the expected output, the error between the actual output and the desired output is computed recursively, layer by layer, and the weights are corrected according to this error. The Back Propagation algorithm is a supervised (teacher-guided) algorithm based on gradient descent. It successfully solves the problem of learning the connection weights of hidden-layer neurons in multi-layer neural networks.

Generally, a medium-sized city has at least tens of thousands of vehicles and hundreds of road sections and intersections, distributed over a broad region. Since there are vehicles running on the roads most of the time, traffic problems can arise at any moment. The randomness of urban traffic is much greater than that of railway and aviation transportation: apart from buses, which have regular travel routes, taxis and individual travelers can freely choose their routes between origin and destination.
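The two-stage Back Propagation procedure described above can be sketched for a single-hidden-layer ("three-layer") network. The XOR data set, layer sizes, learning rate and iteration count are illustrative assumptions, not the dissertation's traffic setup; the code shows only the forward pass followed by layer-by-layer error back-propagation with gradient-descent weight corrections.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # desired outputs

W1 = rng.normal(size=(2, 4))   # input -> hidden weights
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))
lr = 1.0                       # gradient-descent step size

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    # Stage 1: forward propagation, computing each layer's actual output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Stage 2: propagate the output error backwards, layer by layer,
    # and correct the weights according to that error.
    delta_out = (out - y) * out * (1 - out)        # output-layer error term
    delta_h = (delta_out @ W2.T) * h * (1 - h)     # hidden-layer error term
    W2 -= lr * (h.T @ delta_out)
    b2 -= lr * delta_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ delta_h)
    b1 -= lr * delta_h.sum(axis=0, keepdims=True)

final_out = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
final_loss = float(np.mean((final_out - y) ** 2))
```

The same forward/backward pattern generalizes to the single-output three-layer networks used later to store Q values.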
This gives the traffic flow on the network a highly random distribution. Urban traffic signal control is closely tied to these inherent characteristics of urban traffic, and effectively controlling such a large-scale, dynamic, highly uncertain distributed system is a very complex task. Originating in the 19th century, traffic signal control initially aimed to control vehicles crossing intersections, installing traffic lights so that vehicles traveling in opposing directions would not crash into each other. Later its purpose was extended to reducing the average delay time of vehicles, and the control object changed from a single intersection to a broader region consisting of several intersections. Although traffic signal control has developed greatly over recent decades, its current state still lags behind the rapid expansion of cities, the growth of traffic networks and the increase in vehicle numbers. This contradiction is particularly prominent in China. Lacking intelligence, current control systems cannot adjust well to changes in traffic flow; in particular, when handling traffic control in a large-scale network they are markedly short of overall optimizing capability. Therefore, improving the intelligent management and control of the existing traffic network is very significant work for easing road traffic pressure, reducing traffic congestion and improving the capacity of the traffic network.

Building on previous research, this dissertation proposes a hybrid algorithm combining Q-learning from Reinforcement Learning with the Back Propagation algorithm from Artificial Neural Networks, and applies it to the traffic signal control problem of multiple intersections based on the main-street model.
Since the traffic flow at one intersection in an urban traffic network may clash with that of its adjacent intersections (in other words, optimizing local traffic flow might worsen the situation elsewhere), this dissertation adopts overall optimization of traffic flow. Assuming there are n intersections in the traffic network, every intersection can be treated as an Agent, and the entire network is controlled by a manager Agent. Every intersection Agent has two actions: keep the current green-light phase, or change it. The manager Agent is responsible for the overall control strategy and passes that strategy to the corresponding intersection Agents, which are responsible for executing it. The action set of the manager Agent therefore contains 2^n actions, and this dissertation builds a Q-value storage network for each action. Because the traffic signal control problem has a large state space, this dissertation uses many single-output three-layer BP neural networks to store the value of the evaluation function Q for each action, treating the state of the entire traffic network as the input of the Q-value storage network and the Q value of the corresponding state-action pair as the output. The state of the entire traffic network is represented by the combination of the states of all the intersections.

This dissertation analyzes the probability distribution of vehicles arriving at the entrances of the traffic network, establishes a mathematical model of vehicle generation based on the Poisson distribution, gives a computer algorithm realizing this model, and simulates vehicles appearing, running on the roads, arriving at intersections, and leaving the traffic network.
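A Poisson-based vehicle generator like the one described above can be sketched as follows: in each one-second simulation step, the number of vehicles entering the network at an entrance is drawn from a Poisson distribution. The arrival rate and simulation horizon are illustrative assumptions, not the dissertation's parameters.

```python
import numpy as np

rng = np.random.default_rng(42)

def generate_arrivals(rate_per_sec: float, n_steps: int) -> np.ndarray:
    """Number of vehicles arriving at one entrance in each of n_steps seconds,
    drawn independently from a Poisson distribution with the given rate."""
    return rng.poisson(lam=rate_per_sec, size=n_steps)

# Simulate one hour at an assumed mean rate of 0.3 vehicles per second.
arrivals = generate_arrivals(rate_per_sec=0.3, n_steps=3600)
total = int(arrivals.sum())   # expected to be near 0.3 * 3600 = 1080 vehicles
```

Each entrance of the simulated network would use its own such stream, and the per-step counts feed the queue lengths that make up the network state.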
It adopts the Q-learning algorithm from Reinforcement Learning to establish the traffic signal control strategy, and at the same time adopts an improved ε-greedy strategy to choose control actions, so that the algorithm converges toward the global optimum. Finally, this method is compared with the traditional fixed-time control method. The experimental results show that the intelligent control method performs better than fixed-time control; in particular, as the difference in traffic flow between the horizontal and vertical directions grows, the improvement in control effect becomes more evident.
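The ε-greedy action selection mentioned above can be sketched as follows. The dissertation's exact improvement to ε-greedy is not specified here, so this sketch approximates it with a decaying exploration rate (an assumption): early episodes explore randomly, while later episodes become increasingly greedy with respect to the learned Q values.

```python
import random

def epsilon_greedy(q_values, episode, eps0=1.0, decay=0.995, eps_min=0.05):
    """Pick an action index from q_values; explore with probability ε,
    where ε decays with the episode number (assumed schedule)."""
    eps = max(eps_min, eps0 * decay ** episode)
    if random.random() < eps:
        return random.randrange(len(q_values))          # explore: random action
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit

random.seed(1)
q = [0.2, 1.5, 0.7]   # illustrative Q values for three actions

early = [epsilon_greedy(q, episode=0) for _ in range(100)]     # mostly random
late = [epsilon_greedy(q, episode=2000) for _ in range(100)]   # mostly greedy
```

Early in training all three actions appear, while late in training the choices concentrate on action 1, the current Q-value maximizer.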
Keywords/Search Tags: Intelligent