
Research On Large Scale Trajectory Data Processing System

Posted on: 2020-09-20  Degree: Master  Type: Thesis
Country: China  Candidate: J W Shan  Full Text: PDF
GTID: 2428330590996828  Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the development of positioning devices such as GPS receivers and mobile phones, acquiring location information has become easier. Mobile phones, taxis, buses, and similar sources generate massive amounts of trajectory data every day. These data are large in volume and high in dimension, and they contain a great deal of information that can be mined. Many technology companies, such as Uber abroad and Didi Chuxing and Gaode in China, use trajectory data to provide location-based services such as road recommendation and route planning. A bridge is needed between large-scale trajectory data and the applications that consume it. That bridge is the trajectory data processing system, which is the subject of this thesis.
Firstly, for massive trajectory data, this thesis designs and proposes a trajectory data processing system that meets the functional requirements of trajectory data collection, processing, and storage. The system offers good extensibility, high reliability, and strong real-time performance. As a whole it is divided into three modules: a collection module, a processing module, and a storage module. In the storage module, the partitioning behavior of HBase degrades the write performance for trajectory data. To address this, the thesis proposes a pre-partitioning strategy that combines the distribution characteristics of trajectory data with the time attribute of the data to set a reasonable partition granularity and partition size. The experimental part verifies the performance improvement that the pre-partitioning method brings to the storage module.
Secondly, this thesis implements two kinds of queries commonly used in trajectory data processing, the ID-query and the spatio-temporal range query, both in the storage module. Data storage modules usually provide query functions, but with large-scale trajectory data it is a challenge to ensure the reliability and efficiency of the system, and the efficiency of queries, in a distributed environment. For the ID-query, the system optimizes the rowkeys with the GeoHash algorithm. For the spatio-temporal query of trajectory data, the thesis constructs a multi-level index.
Finally, this thesis conducts extensive experiments on the trajectory data processing system, covering the pre-partitioning strategy of the storage module, the ID-query, and the spatio-temporal query. The experimental results show that the proposed large-scale trajectory data processing system can efficiently collect, process, and store trajectory data. The pre-partitioning strategy proposed for the storage module significantly improves insertion speed during trajectory data insertion. In addition, the storage module supports ID-queries and spatio-temporal queries of trajectory data, and the multi-level index structure shows good performance in spatio-temporal retrieval.
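The abstract names GeoHash as the basis for the rowkey design but does not give the layout. As a rough illustration only, the sketch below implements the standard GeoHash encoding and composes a rowkey from it. The function names `geohash_encode` and `make_rowkey`, the field order, and the `|` separator are all assumptions made for this example, not the thesis's actual scheme.

```python
# Standard GeoHash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    """Encode a latitude/longitude pair as a GeoHash string.

    Bits alternate between longitude and latitude, starting with
    longitude; every 5 bits become one base-32 character.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def make_rowkey(object_id: str, ts_ms: int, lat: float, lon: float) -> str:
    """Hypothetical rowkey layout: object ID, zero-padded timestamp, GeoHash.

    Leading with the object ID keeps one object's points contiguous,
    which suits a prefix scan for an ID-query. This layout is an
    assumption for illustration, not the layout used in the thesis.
    """
    return f"{object_id}|{ts_ms:013d}|{geohash_encode(lat, lon, 8)}"


if __name__ == "__main__":
    print(geohash_encode(42.6, -5.6, 5))
    print(make_rowkey("taxi42", 1600560000000, 39.9042, 116.4074))
```

Because nearby points share a GeoHash prefix, a key component like this keeps spatially close records near each other in a sorted store such as HBase. Which component should lead the key depends on the dominant query pattern.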
Keywords/Search Tags: Trajectory Data, Big Data Processing, Time and Space Analysis, Index Construction