
Spatio-temporal Retrieval Method And Visualization Of Cab Big Data Based On Spark

Posted on: 2024-08-22 | Degree: Master | Type: Thesis
Country: China | Candidate: Z X Yao | Full Text: PDF
GTID: 2542307076998039 | Subject: Cartography and Geographic Information Engineering
Abstract/Summary:
In recent years, frequent travel by residents, increasingly diverse means of transportation, and intensive collection of travel data have caused trajectory data to grow explosively. Because trajectory big data is updated quickly and is very large in volume, big data processing technology and distributed database storage are well suited to collecting, managing, and applying it. However, trajectory big data is unevenly distributed and has spatio-temporal characteristics, which leads to write hotspots, skewed storage, high I/O overhead, and slow retrieval when the data is organized and managed. To address these problems, this thesis studies the storage, spatio-temporal indexing, and visualization of trajectory big data. It examines the technical principles of big data processing and distributed database storage, builds a trajectory big data model that integrates data partitioning with spatio-temporal multi-angle hierarchical organization, and then develops and implements a spatio-temporal retrieval method and an interactive visualization method for trajectory big data. Using Xiamen cab trajectory data as the data source, the thesis deploys a distributed cluster, organizes and manages the massive trajectory data from temporal and spatial perspectives on the Spark computing framework, explores spatio-temporal retrieval methods, and combines several visualization techniques to explore the information contained in the trajectories. The research covers the following three main areas:

(1) Building a data storage model that incorporates data partitioning and spatio-temporal multidimensional hierarchical organization. A Hadoop distributed computing cluster with one master node and two slave nodes is built, and the pre-processing and loading of trajectory big data are completed on the Spark computing framework. At the spatial level, an algorithm for partitioning the trajectory data based on the Hilbert curve is explored and combined with the pre-partitioning mechanism of the distributed database HBase to address write hotspots and storage skew. At the temporal level, the day is taken as the unit of organization and management, and storage is encoded at minute granularity, forming a global spatio-temporal subdivision scheme. On this basis, a data storage model based on spatio-temporal multi-angle hierarchical organization is proposed. The model substantially improves the storage and computation efficiency of trajectory big data and can provide efficient data management support for trajectory big data mining and analysis.

(2) Designing the row key structure, constructing a spatio-temporal hybrid code, and exploring spatio-temporal retrieval patterns. Based on the indexing rules of HBase, a row key structure is designed and then fused with the data partitioning to construct a spatio-temporal hybrid code. This code is used to store the multidimensional spatio-temporal data in HBase, and write speeds are compared. Exact point query and spatio-temporal range query patterns are then explored, and their retrieval efficiency is compared. Organizing and managing the trajectory big data hierarchically addresses the problem of storing and retrieving it efficiently. The experimental results show that the index performs well in both exact point queries and spatio-temporal range queries. The model effectively improves the retrieval speed of trajectory big data at different orders of magnitude of data volume while keeping write and query speeds relatively stable, and can therefore provide efficient retrieval for trajectory big data mining and analysis.

(3) Building a cab trajectory big data visualization platform with fully separated front end and back end. An interactive visualization method is designed from an analysis of users' requirements for cab trajectory big data. The back end is built on a combination of the Spring Boot and MyBatis frameworks, and the front end on the Vue 2.0 architecture. The back end relies on the data storage model that fuses data partitioning with spatio-temporal multidimensional hierarchical organization, retrieves exact point data and spatio-temporal range data through the spatio-temporal hybrid index, and passes the results to the front end, which completes the interactive visualization using ECharts, inverse heat maps, and other methods. The whole process facilitates the visualization of analysis results and the exploration of the spatio-temporal distribution characteristics of cabs, and serves as platform support and an application demonstration for the visualization and analysis of multi-source, heterogeneous, massive spatio-temporal big data.
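
The abstract does not give the exact key layout, so the following Python sketch is only one plausible reading of areas (1) and (2): a GPS point is mapped to a grid cell, the cell is converted to a Hilbert curve index, and that index is combined with a day field and a minute-of-day field into an HBase-style row key. The grid order, the bounding box, the number of pre-split regions, the field order, and the names hilbert_index, grid_cell, key_prefix, row_key, and scan_range are all assumptions made for illustration, not details taken from the thesis.

```python
from datetime import datetime

# All constants below are illustrative assumptions, not values from the thesis.
ORDER = 8                           # Hilbert order: a 2**8 x 2**8 grid
BBOX = (117.8, 24.2, 118.5, 24.9)   # rough lon/lat box around Xiamen (assumed)
NUM_REGIONS = 16                    # number of pre-split HBase regions (assumed)


def hilbert_index(order, x, y):
    """Distance of grid cell (x, y) along a Hilbert curve of the given order."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def grid_cell(lon, lat):
    """Clamp a lon/lat point into the assumed bounding box and return its cell."""
    lon_min, lat_min, lon_max, lat_max = BBOX
    n = 1 << ORDER
    x = int((lon - lon_min) / (lon_max - lon_min) * n)
    y = int((lat - lat_min) / (lat_max - lat_min) * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def key_prefix(lon, lat, day):
    """Hypothetical prefix: region | yyyymmdd | Hilbert code (fixed width)."""
    code = hilbert_index(ORDER, *grid_cell(lon, lat))
    # Each region owns one contiguous slice of the Hilbert range, so nearby
    # cells share a region while different areas spread over all regions.
    region = code * NUM_REGIONS // (1 << (2 * ORDER))
    return f"{region:02d}_{day:%Y%m%d}_{code:05d}_"


def row_key(lon, lat, ts, vehicle_id):
    """Full hypothetical row key: prefix | minute of day | vehicle id."""
    minute = ts.hour * 60 + ts.minute
    return f"{key_prefix(lon, lat, ts)}{minute:04d}_{vehicle_id}"


def scan_range(lon, lat, day, start_minute, end_minute):
    """Start (inclusive) and stop (exclusive) keys for one cell and minute window."""
    prefix = key_prefix(lon, lat, day)
    return f"{prefix}{start_minute:04d}", f"{prefix}{end_minute + 1:04d}"


if __name__ == "__main__":
    ts = datetime(2020, 1, 1, 8, 30)
    print(row_key(118.1, 24.5, ts, "car1"))
    print(scan_range(118.1, 24.5, ts, 500, 520))
```

Because every field has a fixed width, lexicographic order on the key matches numeric order on the minute field, so a time window inside one cell becomes a single contiguous scan. A spatial range query under this layout would issue one such scan per Hilbert cell (or per run of consecutive cells) that overlaps the query rectangle.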
Keywords/Search Tags: Trajectory data, Spatio-temporal encoding, Distributed columnar storage, Hilbert partitioning, Interactive visualization