
Spatial Data Analysis Based On Distributed In-memory Computing

Posted on: 2017-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: N Guo
Full Text: PDF
GTID: 2428330569498691
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of spatial data acquisition technology, spatial data is accumulating at an unprecedented rate. Traditional solutions for storing, organizing, processing and analyzing spatial data can no longer meet the needs of current applications, and introducing the latest developments in computer systems into the field of geographic information has become an inevitable trend. Alongside advances in high performance computing, distributed parallel computing and in-memory computing provide new opportunities for analyzing and processing large-scale spatial data. To make full use of the computing and storage resources of a distributed cluster, many theories and methods still need to be studied, such as spatial data models, spatial index structures, query execution strategies and task scheduling algorithms adapted to the characteristics of distributed architectures. In particular, making full use of the large amount of memory distributed across the nodes is the key to improving efficiency.

In this paper, a Key-Value spatial data model is designed and implemented with the characteristics of current distributed computing architectures in mind, and the model is shown to have superior memory adaptability and index extensibility. Data partitioning methods are then discussed, and an adaptive Geohash coding method is proposed as a distributed spatial coding index that supports multiple spatial data types. An adaptive spatial grid index is also implemented to enable efficient queries over spatial data stored on distributed machines; it is also successfully applied to the construction of the tile pyramid in the process of raster data visualization.

Based on the distributed computing platform Spark, we implement buffer analysis and spatial overlay analysis algorithms. The implementations take full advantage of the grid structure and the Spark RDD in-memory data model, and a variety of optimization methods for execution strategies and algorithm scheduling are proposed, which greatly improve the efficiency of the spatial analysis algorithms.

Then, according to the principles and multiple influencing factors of housing location, this paper abstracts a mathematical description of the housing location problem and designs a housing location model that can be simplified and solved in practical applications. The effectiveness of the model and the efficiency of the related spatial analysis algorithms are validated in housing location applications in the Changsha and Beijing areas. Finally, the spatial analysis algorithms and the housing location calculation model based on distributed memory are integrated into a high performance geographic computing platform called HiGIS, which provides a friendly user interface for visualizing spatial analysis results.
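To make the idea of a Geohash-keyed Key-Value spatial model more concrete, the following is a minimal sketch in Python. It uses the standard fixed-precision Geohash encoding, not the adaptive variant proposed in the thesis, and a plain dictionary stands in for a distributed key-value store. The function names `geohash_encode`, `build_store` and `query_prefix` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict

# Standard Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a standard Geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0      # number of bits collected for the current character
    value = 0     # numeric value of the current character
    even = True   # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid   # keep the upper half of the interval
        else:
            value = value << 1
            rng[1] = mid   # keep the lower half of the interval
        even = not even
        bits += 1
        if bits == 5:      # every 5 bits form one base-32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


def build_store(points, precision=6):
    """Group (name, lat, lon) records under their Geohash key.

    Assumption: a dict stands in for the distributed key-value store.
    """
    store = defaultdict(list)
    for name, lat, lon in points:
        store[geohash_encode(lat, lon, precision)].append((name, lat, lon))
    return dict(store)


def query_prefix(store, prefix):
    """Return all records whose Geohash key starts with the given prefix."""
    return [rec for key, recs in store.items()
            if key.startswith(prefix) for rec in recs]
```

Because a shorter Geohash is always a prefix of the longer Geohash of the same point, a prefix query such as `query_prefix(store, geohash_encode(lat, lon, 2))` selects every record stored in the coarser cell that contains the point. This prefix property is what makes Geohash strings convenient as keys for partitioning spatial data across nodes.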
Keywords/Search Tags: distributed architecture, spatial analysis, spatial index, Spark, in-memory computing, Key-Value, spatial grid