
Research On Taxi Trajectory Organization Method Based On Apache Spark

Posted on: 2018-11-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y T Jia  Full Text: PDF
GTID: 2348330518992103  Subject: Cartography and Geographic Information System
Abstract/Summary:
A trajectory is a kind of spatial-temporal data that in recent years has been increasingly applied to research topics such as road change detection, travel mode exploration, and urban hotspot analysis. As data volumes grow, fast retrieval of trajectories becomes more and more important. Traditional spatial-temporal indexing usually extends a two-dimensional spatial index by adding a time dimension or time snapshots. Unlike common spatial features, trajectory data consist of a large number of line segments, so applying an ordinary spatial-temporal index directly to trajectories causes many problems.

In recent years, Hadoop has developed rapidly and provides a good distributed platform for processing massive trajectory data. However, in computations with a large number of iterations the platform has shortcomings that lower computational efficiency. To speed up both the construction and the querying of trajectory indexes, this paper proposes a distributed vehicle trajectory indexing method based on Apache Spark. The method divides the trajectory data into RDDs for parallel processing, uses the map-matching results to build a road-vehicle index, and then combines that index with a grid index of the roads to locate trajectories. It can make full use of the computing performance of each machine in the cluster, and the cluster size can be set flexibly according to the amount of data.

This paper studies three aspects: trajectory data processing and the use of map matching to establish road-vehicle associations; index structure construction and query methods; and data scheduling methods and distributed processing strategies.

1) Trajectory data processing and map-matching method
Based on Apache Spark, this paper studies error elimination and trajectory segmentation for taxi trajectory data, as well as distributed map matching of trajectories using a hidden Markov model. The road-vehicle association is then formed from the map-matching results.

2) Index construction and query method
Based on the map-matching results, an algorithm for building the road-vehicle index is studied. Combined with the grid index of the roads, it yields a two-level (secondary) index structure for trajectory data.

3) Data scheduling methods and distributed processing methods
This paper studies strategies for building, storing, and querying the index on a distributed platform, and optimizes the stages of the process for different cluster configurations.

The experiments use taxi trajectories recorded in Beijing in November 2012 together with the OSM map of Beijing, and the experimental process is optimized. The experimental results show that the proposed method can query trajectories effectively and provides effective support for the analysis of trajectory data.
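The two-level index described above can be illustrated with a minimal single-machine sketch. It is not the thesis implementation: the cell size, the function names, the bounding-box representation of roads, and the `(vehicle_id, road_id, t_enter, t_exit)` record layout are all assumptions made for illustration, and plain Python dictionaries stand in for the distributed grouping that the abstract assigns to Spark RDDs. A grid index maps cells to road IDs, and a road-vehicle index maps each road to the vehicles that map matching placed on it.

```python
from collections import defaultdict
from math import floor

CELL = 0.01  # hypothetical grid cell size in degrees (assumption)


def cell_of(lon, lat, cell=CELL):
    """Return the integer grid cell containing a point."""
    return (floor(lon / cell), floor(lat / cell))


def cells_for_bbox(min_lon, min_lat, max_lon, max_lat, cell=CELL):
    """Return every grid cell overlapped by a bounding box."""
    cx0, cy0 = cell_of(min_lon, min_lat, cell)
    cx1, cy1 = cell_of(max_lon, max_lat, cell)
    return [(cx, cy) for cx in range(cx0, cx1 + 1) for cy in range(cy0, cy1 + 1)]


def build_grid_index(roads, cell=CELL):
    """First level: grid cell -> set of road IDs.

    roads: {road_id: (min_lon, min_lat, max_lon, max_lat)} bounding boxes
    (a simplification; real road geometry is a polyline).
    """
    grid = defaultdict(set)
    for road_id, bbox in roads.items():
        for c in cells_for_bbox(*bbox, cell=cell):
            grid[c].add(road_id)
    return grid


def build_road_vehicle_index(matched):
    """Second level: road ID -> time-sorted list of (t_enter, t_exit, vehicle_id).

    matched: iterable of (vehicle_id, road_id, t_enter, t_exit) records,
    assumed to be the output of a map-matching step.
    """
    index = defaultdict(list)
    for vehicle_id, road_id, t_enter, t_exit in matched:
        index[road_id].append((t_enter, t_exit, vehicle_id))
    for entries in index.values():
        entries.sort()
    return index


def query(grid, road_index, bbox, t_start, t_end, cell=CELL):
    """Return (vehicle_id, road_id) pairs on roads near bbox during [t_start, t_end].

    The spatial filter is cell-level only; exact geometry refinement is omitted.
    """
    candidate_roads = set()
    for c in cells_for_bbox(*bbox, cell=cell):
        candidate_roads |= grid.get(c, set())
    hits = set()
    for road_id in candidate_roads:
        for t_enter, t_exit, vehicle_id in road_index.get(road_id, []):
            if t_enter <= t_end and t_exit >= t_start:  # time intervals overlap
                hits.add((vehicle_id, road_id))
    return hits


if __name__ == "__main__":
    roads = {"r1": (116.401, 39.901, 116.409, 39.905)}
    matched = [("taxi_a", "r1", 100, 200)]
    grid = build_grid_index(roads)
    road_index = build_road_vehicle_index(matched)
    print(query(grid, road_index, (116.402, 39.902, 116.408, 39.904), 150, 300))
```

In a distributed setting the grouping in `build_road_vehicle_index` is the step that would be spread across partitions, while the much smaller grid index over the road network could plausibly be shared with every worker; both choices are assumptions here, not details confirmed by the abstract.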
Keywords/Search Tags: Trajectory data, Spatial-temporal Index, Apache Spark, Map-matching, Organization Method