
MapReduce-based Urban Transportation Distribution Outlier Detection and Analysis

Posted on: 2015-08-04 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Xiang | Full Text: PDF
GTID: 2298330452453211 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of spatio-temporal trajectory data mining, outlier detection in trajectory data has become a research hotspot in the field of data mining. Many traditional outlier detection methods operate in Euclidean space and treat outliers as points that lie a certain distance away from the majority of points. However, in practical emergency response to traffic outliers, an outlier is judged mainly by changes in traffic flow, so the Euclidean distance measure adopted by traditional outlier detection algorithms is no longer applicable. In addition, because of the huge volume of traffic trajectory data, traditional stand-alone outlier detection methods run inefficiently. In this paper, the MapReduce distributed parallel computation framework is used to build a MapReduce-based urban transportation distribution outlier detection and analysis algorithm. The details are as follows:

(1) To better describe the transportation distribution, this paper proposes an urban transportation distribution model based on community traffic flow. The model is simple and easy to understand, and it shows the transportation distribution of the whole city at a macroscopic level.

(2) For the problem of traffic outlier detection, this paper draws on knowledge from the transportation field to define a transportation distribution outlier based on community traffic flow, and gives a formal representation of it.

(3) On this basis, this paper proposes a MapReduce-based distributed parallel traffic outlier detection and analysis algorithm (hereinafter referred to as the MDPTDODA algorithm). The algorithm first preprocesses the taxi trajectory data, then extracts inter-community traffic flow from it and sets up an urban transportation distribution model based on community traffic flow. Finally, it integrates the traffic flows of consecutive days into time series sets, uses DBSCAN to detect traffic outliers, and analyzes the possible causes of the outliers based on the relationships between the transportation distribution outliers.

Using historical taxi trajectory data from Beijing as the original data, experiments were carried out on the stand-alone version of the method in a stand-alone multi-core environment and on the distributed parallel version in a Hadoop-based cluster environment. The results demonstrate the high efficiency of the proposed MDPTDODA algorithm in analyzing and processing a huge amount of trajectory data. In addition, a comparison of the experimental results with what actually happened historically shows that the method is effective in detecting and analyzing outliers.
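
The pipeline described in (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the record layout, the clustering features, or the parameters, so everything named here is an assumption. The sketch assumes each preprocessed trip is a hypothetical tuple (day, origin community, destination community), simulates the map and reduce phases in a single process instead of on Hadoop, and applies a plain one-dimensional DBSCAN to the daily flow counts of one community pair. The names `map_trip`, `reduce_counts`, `dbscan_1d`, `detect_outlier_days`, and the parameters `eps` and `min_pts` are all illustrative.

```python
from collections import defaultdict


def map_trip(trip):
    """Map phase: emit ((day, origin, dest), 1) for one trip record.

    The (day, origin, dest) layout is an assumed, hypothetical format.
    """
    day, origin, dest = trip
    return ((day, origin, dest), 1)


def reduce_counts(pairs):
    """Reduce phase: sum the emitted counts for every key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def dbscan_1d(values, eps, min_pts):
    """Plain DBSCAN on scalars; returns a cluster id per value, -1 for noise."""
    n = len(values)
    labels = [None] * n

    def neighbors(i):
        # A point counts as its own neighbor.
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


def detect_outlier_days(trips, origin, dest, days, eps, min_pts):
    """Build the daily flow series for one community pair and flag noise days."""
    totals = reduce_counts(map_trip(t) for t in trips)
    series = [totals.get((day, origin, dest), 0) for day in days]
    labels = dbscan_1d(series, eps, min_pts)
    return [day for day, label in zip(days, labels) if label == -1]


# Made-up data: daily trip counts from community "A" to community "B".
days = [1, 2, 3, 4, 5, 6, 7]
daily_counts = [10, 11, 9, 10, 30, 11, 10]
trips = []
for day, count in zip(days, daily_counts):
    trips.extend([(day, "A", "B")] * count)

outlier_days = detect_outlier_days(trips, "A", "B", days, eps=2, min_pts=3)
```

With this made-up data, day 5 is the only day whose flow has no neighbors within `eps`, so it is the only one labeled as noise. On a real cluster the map and reduce functions would run as distributed tasks over the trajectory files, and the choice of `eps` and `min_pts` would need tuning against real flow data.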
Keywords/Search Tags:Data Mining, Transportation Distribution Model, Outlier Detection, MapReduce