Distributed Balanced Storage Management Of Massive Spatial-temporal Data Based On Graph Division

Posted on: 2022-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: D N Chen
Full Text: PDF
GTID: 2480306767466034
Subject: Computer Software and Application of Computer
Abstract/Summary:
With the development of the Internet of Things and cloud computing, GIS has entered the era of big data. Smartphones, vehicle-mounted sensors, and other data acquisition devices generate massive spatial-temporal data characterized by large volume, multi-source heterogeneity, and uneven space-time distribution. Many existing schemes manage massive spatial-temporal data on distributed NoSQL databases, but most of them ignore the impact of data distribution on query efficiency and lack support for load balancing of spatial-temporal data storage. Many scholars have also studied data partitioning and storage load balancing in distributed environments, but spatial-temporal data has particular requirements: spatial proximity should be maintained during data partitioning, and the data in each time period must be balanced among nodes during load balancing. Otherwise, frequent interaction between distributed computing nodes greatly reduces the efficiency of spatial-temporal query and analysis. How to partition massive spatial-temporal data with uneven space-time distribution, and how to balance the storage load, therefore remain open problems.

To address these problems, this paper proposes an adaptive data partitioning method that considers spatio-temporal distribution. In space, sampling experiments are used to determine the grid-splitting threshold for a specific application scenario, and a level computing model based on spatial distribution is designed to determine the initial partitioning level, so that grids can be split and merged in parallel, which improves the efficiency of spatially adaptive partitioning. In time, time periods are divided according to the temporal tidal pattern of the data, and a spatial-temporal hierarchical index structure is constructed.

The paper then studies a storage load balancing method based on graph division. In the data grouping stage, the data of multiple time periods are expressed in a uniform spatial distribution by defining the initial distribution. After the adaptive partitioning result is mapped into an undirected weighted graph, the open-source METIS graph division algorithm is used to group the data. In the group tuning stage, an iterative fine-tuning method is designed to balance the data storage load in each time period while maintaining the spatial proximity of the partition results. Finally, based on the index architecture and load balancing method proposed in this paper, the corresponding storage scheme and query algorithm are designed on the HBase database, and the coprocessor mechanism and a secondary index table are used to speed up query processing.

A series of comparative experiments was carried out on New York taxi data. In terms of data partitioning, the results show that the grid point threshold obtained by the sampling experiments is reliable and effectively balances load balancing against query efficiency. The distributed parallel partitioning method proposed in this paper is more efficient than top-down or bottom-up partitioning, and index construction efficiency is improved by more than 20%. In terms of load balancing, although the data grouping method based on graph division is slightly inferior to the method based on the Z-order space-filling curve (SFC), it is better at maintaining spatial proximity. The iterative fine-tuning method reduces the average unbalanced degree of the data sets by about 30% without destroying spatial proximity, thereby balancing data storage in each time period. In terms of query efficiency, the query method that combines the secondary index table with the HBase coprocessor improves query efficiency by about 2.5 times compared with GeoMesa in various query scenarios.
Keywords/Search Tags:Spatial-temporal point data, Data partition, Graph partition, Storage load balance, Spatial-temporal range query
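
The two ideas at the core of the abstract, threshold-driven grid splitting and mapping the resulting cells to an undirected weighted graph, can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names (`split_adaptive`, `shared_border`, `build_graph`), the quadtree-style four-way split, the `max_depth` cap, and the choice of shared border length as edge weight and point count as vertex weight are all assumptions made for illustration. The sketch runs serially and top-down, whereas the thesis describes a parallel split-and-merge procedure starting from a computed initial level.

```python
from collections import defaultdict


def split_adaptive(points, bounds, threshold, max_depth=8, depth=0):
    """Split a cell into four quadrants while it holds more than `threshold` points.

    Returns a list of (bounds, point_count) leaf cells.
    Quadtree-style splitting is an assumption, not the thesis's exact scheme.
    """
    x0, y0, x1, y1 = bounds
    if len(points) <= threshold or depth >= max_depth:
        return [(bounds, len(points))]
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    quads = {(0, 0): [], (1, 0): [], (0, 1): [], (1, 1): []}
    for x, y in points:
        quads[(int(x >= mx), int(y >= my))].append((x, y))
    cells = []
    for (i, j), pts in quads.items():
        sub = (mx if i else x0, my if j else y0, x1 if i else mx, y1 if j else my)
        cells.extend(split_adaptive(pts, sub, threshold, max_depth, depth + 1))
    return cells


def shared_border(a, b):
    """Length of the border shared by two non-overlapping rectangles (0 if none)."""
    overlap_x = min(a[2], b[2]) - max(a[0], b[0])
    overlap_y = min(a[3], b[3]) - max(a[1], b[1])
    if overlap_x == 0 and overlap_y > 0:
        return overlap_y  # cells touch along a vertical edge
    if overlap_y == 0 and overlap_x > 0:
        return overlap_x  # cells touch along a horizontal edge
    return 0.0


def build_graph(cells):
    """Map leaf cells to an undirected weighted graph.

    Vertex weight = point count; edge weight = shared border length
    (both weightings are illustrative assumptions).
    """
    vertex_weights = [count for _, count in cells]
    adjacency = defaultdict(dict)
    for i in range(len(cells)):
        for j in range(i + 1, len(cells)):
            w = shared_border(cells[i][0], cells[j][0])
            if w > 0:
                adjacency[i][j] = w
                adjacency[j][i] = w
    return vertex_weights, adjacency


# Hypothetical sample: a dense cluster in one corner, sparse points elsewhere.
points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (3, 3), (6, 6), (6, 2)]
cells = split_adaptive(points, (0, 0, 8, 8), threshold=2)
vertex_weights, adjacency = build_graph(cells)
```

Dense areas end up with small cells and sparse areas with large ones, so vertex weights stay below the threshold. A graph partitioner such as METIS could then group the vertices so that total vertex weight per group is balanced while heavy edges, and with them spatially adjacent cells, tend to stay in the same group. The sketch stops before that step and does not call METIS.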