
Research On Distributed Storage And Schedule Method For Spatial Big Data Of Urban Underground Space

Posted on: 2023-04-29; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Z P Liu
GTID: 1520307148984729; Subject: Surveying the science and technology
Abstract/Summary:
With the continuing advance of urbanization, the gap between the supply of and demand for urban construction land is becoming increasingly serious, and underground space development has become an important part of modern urban construction. In the future, the development and utilization of urban underground space will turn to active, deep exploration, with a trend toward transparent underground information and intelligent operation, maintenance, and management. Fine-grained detection of urban underground space generates a large amount of geological data with spatial attributes. Unlike traditional spatial data, these data are large in volume, complex in their relationships, and diverse in their organization, which places higher demands on spatial data management. Existing research on parallel computing reduces the difficulty of managing and analyzing spatial big data by encapsulating the underlying details behind parallel computing models. However, the characteristics of underground spatial data strongly affect the efficiency of parallel analysis, and general spatial analysis methods are easily hampered by data locality and load balancing problems. As underground space data grow rapidly, optimizing distributed storage and analysis for specific spatial and analysis characteristics will have an increasing impact on the efficiency of parallel analysis. This dissertation considers urban underground space exploration data and the volume data models derived from them, and focuses on distributed storage, parallel analysis, and multi-task scheduling methods for urban underground space data. The main content is as follows:

(1) Exploration of urban underground space produces a large number of spatial files that differ in distribution, size, and organization. General parallel computing technology struggles to efficiently support parallel analysis of underground space under different data access modes. This dissertation proposes a hierarchical storage method for underground space exploration data that accounts for different access modes and optimizes the storage levels for the access modes of different parallel analyses. A storage model is designed for the heterogeneous, multi-source characteristics of the data, and a spatial partitioning and indexing algorithm based on the multi-scale and distributed data block characteristics of the storage model is developed and applied to the hierarchical storage of the distributed spatial index, merged file organization, and collaborative file storage. On top of this storage, the distributed spatial index is used to optimize the construction of parallel analysis tasks. Experiments showed that the hierarchical storage method could improve parallel analysis efficiency under different data access modes in a distributed environment.

(2) Volume data is one of the most commonly used data organization formats in geological modeling derived from exploration data. Under a general distributed storage mechanism, parallel analysis of volume data incurs network transmission overhead caused by poor data locality, which strongly affects the efficiency of both the parallel tasks and the cluster. This dissertation proposes a volume data replica placement method that takes neighborhood correlation into account. The method uses the multiple replicas of the distributed file system to design different neighborhood storage modes, and optimizes data locality and load balance in storage and computing. Based on this optimization, a general parallel analysis model for volume data is designed, which allows traditional volume data analysis algorithms to be extended efficiently to the distributed environment. The experimental results showed that the method could significantly improve the parallel analysis efficiency of volume data while maintaining the reliability and write efficiency of distributed storage.

(3) The hierarchical storage and replica placement methods for exploration data and volume data allow parallel analysis of underground space to run efficiently when computing resources are sufficient. However, resource preemption between tasks reduces efficiency when multiple tasks execute together. This dissertation focuses on the data locality and load balancing problems in the underground space data storage method, and studies a parallel task scheduling method for underground space based on an improved genetic algorithm. An estimation model of execution resource cost and running time is constructed according to the analysis process and spatial characteristics of the parallel analysis tasks. Taking advantage of the storage optimization of exploration data and volume data, the total execution time and load balancing parameters under multi-task scheduling are set as the iterative optimization objectives, and the genetic algorithm operations are implemented. The impact of the main parameters of the genetic algorithm on the scheduling results is discussed. The experiments showed that this method could significantly improve the scheduling efficiency of underground space parallel analysis tasks.

The innovation of this dissertation lies mainly in three aspects: (1) A hierarchical storage method for underground space exploration data is proposed that considers different scales and the access patterns of correlated files. Different storage levels can be employed to improve analysis efficiency under the corresponding access mode. (2) A volume data replica placement strategy is proposed that uses the multi-replica storage mechanism to construct different neighborhood storage modes, which reduces the impact of poor data locality on cluster performance. (3) A task scheduling algorithm that considers spatial characteristics and distributed storage characteristics is proposed, which improves the scheduling efficiency of underground space parallel analysis tasks through the improved genetic algorithm.

Considering the storage and analysis characteristics of spatial big data in urban underground space, this dissertation studies efficient analysis methods for underground space exploration data and volume data from three aspects: the storage model, the index and storage optimization method, and the parallel analysis model. On this basis, the scheduling method for underground space analysis tasks is further explored. The dissertation promotes the practice of spatial big data analysis of urban underground space in distributed environments, and provides research ideas for the management of underground spatial data and other similar spatial big data application scenarios, which has important scientific significance and social value.
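
The scheduling idea in item (3) can be illustrated with a small sketch. The abstract does not give the dissertation's cost model, chromosome encoding, or genetic operators, so everything below is an assumption made for illustration: a plain (not "improved") genetic algorithm, a hypothetical `remote_penalty` factor that stands in for the extra cost of running a task on a node that holds none of its data, and a fitness that adds the longest node load (a proxy for total execution time) to a weighted standard deviation of node loads (a proxy for load balance).

```python
import random
from statistics import pstdev


def node_loads(assignment, costs, local_nodes, n_nodes, remote_penalty=1.5):
    """Estimated load per node; non-local tasks pay a hypothetical penalty."""
    loads = [0.0] * n_nodes
    for task, node in enumerate(assignment):
        cost = costs[task]
        if node not in local_nodes[task]:
            cost *= remote_penalty  # assumed stand-in for network transfer cost
        loads[node] += cost
    return loads


def fitness(assignment, costs, local_nodes, n_nodes, balance_weight=0.5):
    """Lower is better: longest node load plus a weighted imbalance term."""
    loads = node_loads(assignment, costs, local_nodes, n_nodes)
    return max(loads) + balance_weight * pstdev(loads)


def schedule(costs, local_nodes, n_nodes, pop_size=30, generations=60,
             mutation_rate=0.1, seed=0):
    """Search for a task-to-node assignment with a basic genetic algorithm.

    Requires at least 2 tasks and pop_size >= 4.
    """
    rng = random.Random(seed)
    n_tasks = len(costs)

    def score(assignment):
        return fitness(assignment, costs, local_nodes, n_nodes)

    # A chromosome is a list: position = task index, value = node index.
    population = [[rng.randrange(n_nodes) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score)
        elite = population[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_tasks)  # single-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            for i in range(n_tasks):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_nodes)  # random reassignment
            children.append(child)
        population = elite + children
    return min(population, key=score)


if __name__ == "__main__":
    # Made-up example: 8 tasks, 3 nodes, and the nodes holding each task's data.
    costs = [4, 3, 2, 6, 5, 1, 2, 3]
    local_nodes = [{0}, {1}, {2}, {0, 1}, {1, 2}, {0}, {2}, {1}]
    best = schedule(costs, local_nodes, n_nodes=3)
    print(best, fitness(best, costs, local_nodes, 3))
```

Folding both objectives into one weighted score keeps the sketch short. The abstract only says that total execution time and load balancing parameters are the optimization objectives, so the weighting used here should not be read as the dissertation's formulation.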
Keywords/Search Tags:urban underground space, spatial parallel computing, spatial hierarchical storage, neighborhood correlation, volume model, spatial analysis task schedule