Research On Taxi Trajectory Organization Method Based On Apache Spark

Posted on:2018-11-20

Degree:Master

Type:Thesis

Country:China

Candidate:Y T Jia

Full Text:PDF

GTID:2348330518992103

Subject:Cartography and Geographic Information System

Abstract/Summary:

A trajectory is a kind of spatial-temporal data. In recent years, trajectories have been increasingly applied to research topics such as road change detection, travel mode exploration, and urban hotspot analysis. As data volumes grow, fast retrieval of trajectories becomes more and more important. Traditional spatial-temporal indexing usually extends a two-dimensional spatial index by adding a time dimension or time snapshots. Unlike common features, trajectory data consist of a large number of line segments, so applying an ordinary spatial-temporal index directly to trajectories causes many problems.

Hadoop has developed rapidly in recent years and provides a good distributed platform for processing massive trajectory data. However, for computations with a large number of iterations the platform has shortcomings that lower computational efficiency. To speed up the construction and querying of the trajectory index, this thesis proposes a distributed vehicle trajectory indexing method based on Apache Spark. The method partitions the trajectory data into RDDs for parallel processing, uses the map-matching results to build a road-vehicle index, and then combines this with a grid index of the roads to locate trajectories. It can make full use of the computing performance of each machine in the cluster, and the cluster size can be set flexibly according to the amount of data.

The thesis studies three aspects: processing the trajectory data and using map matching to establish the road-vehicle association; index construction and query methods; and data scheduling methods and distributed processing strategies.

1) Trajectory data processing and map-matching method
Based on Apache Spark, this thesis studies error elimination and trajectory segmentation for taxi trajectory data, and a distributed map-matching method for trajectories that uses a hidden Markov model. The association between roads and vehicles is then formed from the map-matching results.

2) Index construction and query method
Based on the map-matching results, the construction algorithm for the road-vehicle index is studied. Combined with the grid index of the roads, this yields a two-level index structure for trajectory data.

3) Data scheduling methods and distributed processing methods
This thesis studies strategies for building, storing, and querying the index on a distributed platform, and optimizes the stages of the process for different cluster configurations.

The experiments use taxi trajectories recorded in Beijing in November 2012 together with the OSM map of Beijing, and the experimental process is optimized. The experimental results show that the proposed method can query trajectories effectively and provides effective support for the analysis of trajectory data.
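The two-level lookup described in the abstract (a grid index over roads, then a road-vehicle index built from map-matching results) can be illustrated with a minimal single-machine sketch. This is not the thesis's implementation: the function names, the bounding-box representation of roads, the record layout `(vehicle_id, timestamp, road_id)`, and the sample data are all assumptions made for illustration. In a Spark job, the grouping step would presumably be expressed as a keyed aggregation over an RDD rather than a local dictionary.

```python
from collections import defaultdict


def build_grid_index(roads, cell_size):
    """Map each grid cell (col, row) to the ids of roads whose bounding box overlaps it.

    `roads` is a hypothetical dict: road_id -> (x1, y1, x2, y2) in planar coordinates.
    """
    grid = defaultdict(set)
    for road_id, (x1, y1, x2, y2) in roads.items():
        min_col, max_col = int(min(x1, x2) // cell_size), int(max(x1, x2) // cell_size)
        min_row, max_row = int(min(y1, y2) // cell_size), int(max(y1, y2) // cell_size)
        for col in range(min_col, max_col + 1):
            for row in range(min_row, max_row + 1):
                grid[(col, row)].add(road_id)
    return grid


def build_road_vehicle_index(matched_points):
    """Group map-matched records by road: road_id -> sorted list of (timestamp, vehicle_id)."""
    index = defaultdict(list)
    for vehicle_id, timestamp, road_id in matched_points:
        index[road_id].append((timestamp, vehicle_id))
    for entries in index.values():
        entries.sort()
    return index


def query(grid, road_vehicle, cell_size, window, t_start, t_end):
    """Return vehicles matched to roads in cells touching `window` during [t_start, t_end].

    This is a coarse, cell-level filter: roads that share a cell with the window
    but lie outside it are still reported as candidates.
    """
    xmin, ymin, xmax, ymax = window
    candidate_roads = set()
    for col in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for row in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            candidate_roads |= grid.get((col, row), set())
    vehicles = set()
    for road_id in candidate_roads:
        for timestamp, vehicle_id in road_vehicle.get(road_id, []):
            if t_start <= timestamp <= t_end:
                vehicles.add(vehicle_id)
    return vehicles


# Made-up sample data, for illustration only.
CELL_SIZE = 100
roads = {"r1": (0, 0, 150, 0), "r2": (300, 300, 300, 450)}
matched = [("taxiA", 10, "r1"), ("taxiB", 20, "r1"), ("taxiA", 30, "r2"), ("taxiC", 95, "r2")]

grid_index = build_grid_index(roads, CELL_SIZE)
road_vehicle_index = build_road_vehicle_index(matched)
result = query(grid_index, road_vehicle_index, CELL_SIZE, (0, 0, 120, 50), 0, 15)
```

With this sample data, the query window covers only cells containing road `r1`, and the time range keeps only the record at timestamp 10. The spatial filter touches only the small grid structure, and the larger per-road lists are scanned only for candidate roads.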

Keywords/Search Tags:

Trajectory data, Spatial-temporal Index, Apache Spark, Map-matching, Organization Method

Related items

1	Temporal Query Analysis And Temporal Index Optimization Based On Apache Spark
2	Research On Optimization Of Trajectory Data Query In Big Data Environment
3	Research On Algorithms For Spatial-temporal Anomaly Trajectory Detection In Cloud Computing Environment
4	A Spatial Data Model Based On RDBMS And Its Special Application
5	Research On Algorithms For Spatial-Temporal Trajectory Outlier Detection In Cloud Computing Environment
6	Spatial-Temporal Data Mining Based On GPS Trajectory And Geo-Tagged Photo Trajectory
7	Research On Algorithm Of Road-network Aware Spatial-temporal Trajectory Clustering
8	Research On NoSQL Database For Trajectory Big Data Storage And Query
9	Research On K-Prototypes Algorithm Based On Mixed Data And Implementation Of Spark Platform
10	Trajectory Similarity Algorithms And Applications Based On Spatial And Temporal Data