Parallel Method For Big Spatial Data Processing With The Consideration Of Spatial Subdomain Distribution In Cloud Environment

Posted on: 2018-03-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X W Zhao
GTID: 1318330512485493
Subject: Cartography and Geographic Information System
Abstract/Summary:
The vigorous development of aerial stereoscopic observation and mobile Internet technologies has greatly promoted the growth of big spatial data, forcing the spatial analysis and computing paradigm to shift from centralized processing and single-user interaction to highly scalable, highly efficient, multi-source data processing. Using cloud computing resources to process big spatial data in parallel is an important way to accomplish this paradigm shift. The parallel computing paradigm in the cloud environment is essentially a single-instruction, multiple-data stream model, which requires the dataset to be divided into separate, non-shared portions for parallel processing. However, spatial data is heterogeneous, unevenly distributed, and strongly correlated across entities, so it cannot be segmented directly to fit the parallel computing paradigm of the cloud environment. Most traditional parallel methods for spatial data processing target specific application scenarios, give little consideration to the association and distribution characteristics of spatial entities, and do not form a systematic parallel computing methodology that covers spatial data storage, partitioning, and efficiency optimization.

To address these problems, this dissertation studies a parallel method for big spatial data processing that takes spatial subdomain distribution into account. Data partitioning strategies are put forward for spatial operations with different subdomain distribution characteristics. Taking spatial computing scenarios as examples, the correctness and efficiency of the proposed methods are verified on real big spatial datasets. The work provides support and case references for processing big spatial data in the cloud environment. The contents are summarized as follows:

(1) Based on the requirements of the parallel computing paradigm in the cloud environment, a unified method for spatial data organization and storage is designed, and a general workflow for parallel spatial computing is built. The classification of spatial operations and their spatial subdomain distributions for data partitioning are studied. A workload evaluation method for spatial subdomains is proposed.

(2) Based on the above methods, two general data partitioning methods for local spatial operations are designed: default subdomain-based decomposition and grid subdomain-based decomposition. Parallel drawing methods for spatial heat maps and pyramid vector maps are implemented on top of these two methods. Their applicability and efficiency are verified using billions of global points of interest and millions of vector polygons as test data.

(3) Three regular spatial subdomain distributions of neighborhood spatial operations are summarized: the range distribution, the spatio-temporal range distribution, and the distribution caused by heterogeneous data superposition error. Parallel methods are designed for spatial operations with each of these three kinds of subdomain distribution. Their applicability and efficiency are verified through applications to spatial distance join, spatio-temporal hot spot analysis, and large-scale three-dimensional surface area calculation.

(4) For neighborhood spatial operations with irregular subdomain distribution, two methods for determining the range of a spatial subdomain are proposed: the uniform grid expansion method and the Voronoi-based method. On this basis, a parallel algorithm for the k-nearest-neighbor join is implemented. The applicability and efficiency of the two methods are compared in performance experiments.
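
As an illustration of the partitioning idea behind item (3), the following is a minimal single-machine sketch of a distance join over grid subdomains with a range distribution. It is not the dissertation's implementation: the function names (`cell_of`, `partition_with_range`, `distance_join`), the point representation as `(x, y)` tuples, and the parameters `cell_size` and `radius` are all assumptions made for this example. Each point of the first dataset is assigned to exactly one grid cell, while each point of the second dataset is copied into every cell that its search range overlaps, so every cell can be processed without sharing data with the others.

```python
from collections import defaultdict
from math import floor, hypot


def cell_of(x, y, cell_size):
    """Return the (column, row) index of the grid cell containing (x, y)."""
    return (floor(x / cell_size), floor(y / cell_size))


def partition_with_range(points_a, points_b, cell_size, radius):
    """Build independent grid partitions for a distance join (hypothetical helper).

    Points of A go to their home cell only. Points of B are replicated into
    every cell overlapped by the square of half-width `radius` around them.
    """
    parts = defaultdict(lambda: ([], []))
    for p in points_a:
        parts[cell_of(p[0], p[1], cell_size)][0].append(p)
    for q in points_b:
        cx0, cy0 = cell_of(q[0] - radius, q[1] - radius, cell_size)
        cx1, cy1 = cell_of(q[0] + radius, q[1] + radius, cell_size)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                parts[(cx, cy)][1].append(q)
    return parts


def distance_join(points_a, points_b, cell_size, radius):
    """Return all pairs (a, b) with Euclidean distance <= radius.

    Each partition is joined on its own; in a cloud setting the loop body
    could run as one task per partition (an assumption of this sketch).
    """
    pairs = []
    for a_list, b_list in partition_with_range(
        points_a, points_b, cell_size, radius
    ).values():
        for a in a_list:
            for b in b_list:
                if hypot(a[0] - b[0], a[1] - b[1]) <= radius:
                    pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    a = [(0.5, 0.5), (9.9, 9.9), (25.0, 3.0)]
    b = [(1.0, 0.5), (10.1, 10.1), (40.0, 40.0)]
    for pair in distance_join(a, b, cell_size=10.0, radius=1.0):
        print(pair)
```

Because every point of the first dataset lives in a single cell, a matching pair is reported once even though points of the second dataset are duplicated across cell borders. The price is extra storage and work near the borders, which grows as `radius` approaches `cell_size`.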
Keywords/Search Tags: Big spatial data, Spatial subdomain, Neighborhood spatial operations, Spatial partitioning