
The Discovery Method And System Implementation Of Urban Congestion Area Based On Distributed Clustering

Posted on: 2019-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: P H Zhai
GTID: 2392330611998463
Subject: Computer technology
Abstract/Summary:
Timely and accurate discovery of congested urban areas helps traffic management departments plan emergency measures and improvements to road infrastructure, and gives municipal planning departments valuable guidance for urban construction. This thesis studies how to identify traffic jams effectively and then, based on that method, designs a quasi-real-time monitoring system that is stable, reliable and high-performance.

At present there are two traditional ways to build such systems. The first is a checkpoint-based sensing and monitoring system, which requires substantial manpower and material cost, has a relatively long construction period, and is difficult to extend to remote places. The second is a data analysis method that computes a traffic congestion index from factors such as the speed, flow and travel time of each road section, labels the data, and then identifies or predicts congestion from the historical data. This method requires basic data that exactly matches the road sections and the maps. In view of this, the thesis proposes an SP-Canopy-KMeans clustering method that identifies congestion areas from vehicle GPS trajectory data.

Extracting and calculating road congestion information from massive vehicle traffic data is, however, a complex and arduous task. Traditional single-machine algorithms cannot cope with it, and even some distributed algorithms need to be optimized and improved as the underlying technology is replaced. In this context, the thesis starts from the timeliness and persistence of traffic jams and the stream-oriented computing model of Spark Streaming, and uses taxi GPS trajectory data from Shenzhen to calculate and analyze the characteristics of urban traffic jams and to find congestion areas on the Spark distributed platform. The main content of the thesis covers the following three aspects:

(1) Technology: The system makes full use of the open source components that underlie existing Internet big data platforms, since these components have been proven and tested by the market. For data collection, Flume acquires massive GPS trajectory data from multiple data sources and sends it to the message queue component Kafka, which provides high concurrency, high throughput and high performance for big data. The clustering module processes the data in Kafka and calculates the currently congested areas in quasi-real time, using the sliding window mechanism of Spark Streaming. The business layer uses Docker container technology and can scale out in a few seconds. The Spark ecosystem cluster components, managed by the CM, can add or remove cluster nodes in about 1 minute.

(2) System design: The thesis designs a complete architecture for a quasi-real-time urban congestion area detection system based on stream computing. It includes data source acquisition, data preprocessing, data mining, business computing and dynamic display of congestion. The architecture supports hot expansion of the cluster and can be used as a template for project development.

(3) Algorithm application: The SP-Canopy-KMeans clustering algorithm is designed and implemented. Its SP-Canopy stage is obtained by optimizing the MapReduce Canopy algorithm for parallel execution on Spark. This not only addresses the defects of the traditional K-Means algorithm, but also makes the algorithm more efficient on the Spark platform.
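The general idea of seeding K-Means with Canopy can be illustrated with a minimal single-machine sketch. This is not the thesis's SP-Canopy implementation, which runs in parallel on Spark. The function names, the thresholds `t1` and `t2`, the sample points, and the use of planar coordinates with Euclidean distance (rather than raw GPS latitude and longitude) are all assumptions made for illustration. The sketch shows how a cheap Canopy pass can supply both the number of clusters and the initial centers that plain K-Means would otherwise need to be given.

```python
import math


def dist(a, b):
    """Euclidean distance between two planar points (assumed coordinates)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def canopy(points, t1, t2):
    """Canopy pre-clustering with a loose threshold t1 and a tight threshold t2.

    Returns a list of (center, members) pairs. Hypothetical helper for
    illustration; the first remaining point is taken as each new center
    so that the result is deterministic.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining.pop(0)
        members = [center]
        keep = []
        for p in remaining:
            d = dist(p, center)
            if d < t1:        # loosely close: belongs to this canopy
                members.append(p)
            if d >= t2:       # not tightly close: may still seed a later canopy
                keep.append(p)
        remaining = keep
        canopies.append((center, members))
    return canopies


def kmeans(points, centers, iterations=20):
    """Plain Lloyd-style K-Means starting from the given initial centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        new_centers = []
        for old, group in zip(centers, groups):
            if group:
                new_centers.append((sum(x for x, _ in group) / len(group),
                                    sum(y for _, y in group) / len(group)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:       # converged
            break
        centers = new_centers
    return centers


# Made-up sample data: two tight groups of slow-moving vehicles.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]

canopies = canopy(points, t1=5.0, t2=3.0)
seeds = [center for center, _ in canopies]   # K = number of canopies
centers = kmeans(points, seeds)
print(len(seeds), centers)
```

With real GPS trajectories, the distance function would need to account for latitude and longitude (for example a great-circle distance), and the thresholds would have to be tuned to the road network; both choices are outside what the abstract specifies.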
Keywords/Search Tags:Traffic Jam, Clustering Algorithm, Distributed System, Sliding Window