Research On Mining Taxi Pick-up Hotspots Based On Spatial Cluster And Weka

Posted on: 2015-01-11  Degree: Master  Type: Thesis
Country: China  Candidate: P P Liu  Full Text: PDF
GTID: 2268330428998080  Subject: Computer application technology
Abstract/Summary:
Today, almost every taxi is fitted with a GPS dispatching terminal. These devices send real-time status information, such as position, velocity and passenger status, to the taxi dispatching center about every second. How to extract useful information from this huge volume of taxi operational data, and use it to guide optimal dispatching for a taxi company, is currently a hot topic.

By studying the state and development of taxi dispatching systems in China and abroad, we find that most of these systems rely on static historical data and a coarse-grained, centralized scheduling scheme, and that some are operated manually in a telephone reservation mode. Current dispatching systems suffer from lag and imprecision. When the volume of requests is large, the dispatching center is often overloaded and the scheduling result is far from ideal. This even leads most taxi drivers to search for passengers blindly and at random.

Based on this analysis, and drawing on spatial clustering technology, this thesis presents a distributed and dynamic scheduling scheme based on taxi pick-up hotspots. An improved spatial clustering algorithm, R-FDBSCAN, is proposed. It has an additional parameter R that controls the range of clusters, so that the historical taxi GPS data can be clustered at a fine granularity. The thesis integrates R-FDBSCAN into the Weka platform and mines the pick-up hotspots from the taxi data of Beijing. The specific work is as follows:

1. This thesis puts forward a new scheduling scheme based on taxi pick-up hotspots. The scheme performs spatial clustering on the historical taxi GPS data and mines fine-grained passenger hotspots. Two concepts, the center of mass and the heat, are defined respectively as the location of a pick-up hotspot and the degree of demand for taxis at that hotspot. The final, reduced hotspot knowledge is stored in each taxi's dispatching terminal and is used to implement quick offline scheduling and real-time dynamic scheduling. The proposed scheme reduces the heavy workload of the taxi control center.

2. To obtain even and fine-grained clusters from the historical taxi GPS data, this thesis proposes an improved spatial clustering algorithm, R-FDBSCAN. An analysis of the common clustering algorithms shows that the classical density-based algorithm DBSCAN falls short in memory requirements and execution efficiency, while its fast version, FDBSCAN, cannot produce even and fine-grained clusters on taxi GPS data. This thesis therefore puts forward R-FDBSCAN, which adds range control through a parameter R. When choosing representative seed objects to expand clusters, the algorithm uses R to judge whether the seeds should continue to expand, so that each cluster is confined to a circular area of radius R, as taxi scheduling requires. Experimental results show that, compared with DBSCAN and FDBSCAN, R-FDBSCAN has certain advantages in running time and in clustering results. As R increases, its restriction on cluster propagation becomes looser and the number of clusters becomes smaller. When R exceeds a certain value, the algorithm degenerates into FDBSCAN.

3. This thesis integrates R-FDBSCAN with the data mining platform Weka, mines the pick-up hotspots with the proposed algorithm, and analyzes them visually. By analyzing the secondary development interfaces provided by Weka, we implement R-FDBSCAN on this platform. On this basis, we compile statistics on 4 days of GPS data from 12,000 taxis in Beijing, divide normal working days and holidays into different time periods according to their characteristics, and perform spatial clustering for each period. By overlaying the reduced passenger hotspots on a real map in ArcGIS, colored by graded heat value, and comparing them with the regular daily patterns of high-heat regions in residents' travel, we verify the feasibility of using the center of mass and the heat to reduce the hotspots, and thus of realizing distributed and dynamic scheduling.

The proposed distributed dynamic taxi scheduling scheme based on pick-up hotspots guides taxi drivers to areas of high demand accurately and rapidly. It distributes the dispatching task to each GPS terminal, reduces the workload of the dispatching center and ultimately increases the revenue of the whole taxi business. The improved algorithm R-FDBSCAN can produce even and fine-grained clusters of spatial data. Together with the passenger hotspots mined in this thesis, it can also be used to analyze residents' daily behavior, to guide the government in planning urban transportation infrastructure reasonably, to set up fixed taxi waiting stands, and so on.
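The range-control idea and the hotspot reduction described above can be illustrated with a minimal Python sketch. It is not the thesis's R-FDBSCAN implementation: it uses a plain DBSCAN-style expansion (no FDBSCAN representative seed selection, no Weka integration), assumes planar coordinates in a common unit, and assumes that heat is simply the number of pick-up points in a cluster. The function names `range_limited_dbscan` and `summarise_hotspots` and the parameters `eps`, `min_pts` and `r` are hypothetical.

```python
from math import hypot


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if hypot(x - xi, y - yi) <= eps]


def range_limited_dbscan(points, eps, min_pts, r):
    """DBSCAN-style clustering in which a cluster may only grow within
    distance r of the core point that started it (assumed reading of the
    range parameter R). Returns one label per point; -1 marks noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = noise          # may later become a border point
            continue
        ox, oy = points[i]             # origin of this cluster
        labels[i] = cluster_id
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] is not unvisited and labels[j] != noise:
                continue               # already assigned to a cluster
            xj, yj = points[j]
            if hypot(xj - ox, yj - oy) > r:
                continue               # outside radius r: left for a later cluster
            was_noise = labels[j] == noise
            labels[j] = cluster_id
            if was_noise:
                continue               # border point, known not to be core
            nb = region_query(points, j, eps)
            if len(nb) >= min_pts:     # j is a core point: keep expanding
                queue.extend(nb)
        cluster_id += 1
    return labels


def summarise_hotspots(points, labels):
    """Reduce each cluster to (centroid_x, centroid_y, heat), where heat is
    assumed here to be the number of pick-up points in the cluster."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        if lab == -1:
            continue
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n, n) for lab, (sx, sy, n) in sums.items()}


if __name__ == "__main__":
    # Ten evenly spaced pick-up points along a street (made-up data).
    pts = [(float(x), 0.0) for x in range(10)]
    for r in (3.0, 100.0):
        labels = range_limited_dbscan(pts, eps=1.5, min_pts=2, r=r)
        print(r, labels, summarise_hotspots(pts, labels))
```

On this toy chain of points, a small `r` splits the street into several compact hotspots, while a large `r` lets the cluster propagate along the whole chain, which mirrors the abstract's remark that a larger R loosens the restriction and yields fewer clusters.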
Keywords/Search Tags:taxi dispatching scheme, pick-up hotspots, spatial clustering, R-FDBSCAN, Weka