Research on a Method of Distributed Search of Temporal-Spatial Big Data Based on Hadoop

Posted on: 2018-04-13 | Degree: Master | Type: Thesis
Country: China | Candidate: G Y Wang | Full Text: PDF
GTID: 2348330515959911 | Subject: Computer application technology
Abstract/Summary:
As space missions become more complicated and more frequent, the volume of data generated by scientific satellites is growing exponentially, and a single scientific satellite generates a large amount of data in orbit. These massive data are varied, heterogeneous, real-time, and stored in a distributed manner, and they pose challenges for computing capacity, storage systems, and communication speed. Traditional data management systems built on relational database management systems and file systems cannot store massive structured data and cannot cope with the demands of high concurrency and high scalability, so a new method is needed to manage the data effectively.

Space science data is traditionally organized by space partitioning. The approach relies on a traditional relational database system or file system and on a space-partition grid: the spatial data is encoded according to its spatial region, and a query is completed by retrieving the corresponding codes. However, because this organization depends on a traditional relational database, its capacity for storing massive structured data is insufficient. Hadoop is a distributed system framework for handling massive data and shows great advantages in supporting large-scale data. However, because Hadoop was originally developed as a big data processing framework for one-dimensional unstructured or semi-structured data, it cannot be used directly to organize and manage structured space science big data.

To address these two problems, namely that traditional data management systems cannot store massive data and that Hadoop cannot be applied directly to the organization of structured space science data, this paper proposes a Hadoop-based algorithm for the distributed, region-based organization of space science big data that supports fast data retrieval, and tests and analyzes its performance with multiple data sets. The main contents of this paper are as follows.

First, current domestic and international research results in two areas are introduced systematically: indexing methods for spatial-temporal data and organization methods for two-dimensional space science data. The components of Hadoop, including the working mechanisms of HDFS, MapReduce, and Hive, are also studied. This provides a theoretical basis for the later work.

Second, an index method for spatial-temporal data is designed on the Hadoop architecture, comprising a data source index, a time index, and a two-level spatial index. The two-level spatial index consists of a spatial global index, used to locate data blocks across the nodes, and a spatial local index, used to query data within a data block. A method for building the data source index and the time index with Hive components and a cube-based Block-Grid partition method are proposed, and a data retrieval algorithm for the distributed environment is designed.

Third, the distribution strategy for the data source index, time index, and spatial index information within the Hadoop distributed system architecture is designed, together with the retrieval process of a data query operation. This paper proposes a method for computing the grid sequence that covers the target query region, which effectively improves the efficiency of data retrieval.

Fourth, NSSC-Hadoop, a distributed system architecture for handling structured space science data, is designed on the Hadoop infrastructure. The overall structure of the system, the design of the distributed cluster, and the cluster configuration process are introduced in detail. Several experiments are carried out to verify the algorithm, and the experimental results are analyzed.

Finally, the research work is summarized and directions for future work are outlined.
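
The abstract mentions computing the grid sequence that covers a target query region but gives no details. The Python sketch below is only one plausible illustration of that idea, not the thesis's algorithm. It assumes a uniform cubic grid over three axes (for example two spatial axes and time) and a row-major cell code. The function names `covering_cells` and `cell_code`, the grid origin, the cell size, and the grid dimensions are all hypothetical.

```python
from itertools import product
from math import floor


def covering_cells(query_min, query_max, origin, cell_size):
    """Return index triples of all grid cells that overlap a query box.

    All arguments are 3-tuples. The uniform grid, its origin, and its
    cell size are illustrative assumptions, not taken from the thesis.
    A box whose upper bound lies exactly on a cell boundary also returns
    the neighbouring cell, so the cover is conservative.
    """
    ranges = []
    for lo, hi, start, size in zip(query_min, query_max, origin, cell_size):
        if hi < lo:
            raise ValueError("query_max must not be smaller than query_min")
        first = floor((lo - start) / size)  # first cell touched on this axis
        last = floor((hi - start) / size)   # last cell touched on this axis
        ranges.append(range(first, last + 1))
    # Cartesian product of the per-axis ranges gives every covered cell.
    return list(product(*ranges))


def cell_code(cell, dims):
    """Map a cell index triple to a single row-major code (hypothetical encoding)."""
    ix, iy, iz = cell
    nx, ny, _nz = dims
    return (iz * ny + iy) * nx + ix


if __name__ == "__main__":
    cells = covering_cells((0.5, 0.5, 0.5), (1.5, 0.9, 0.9),
                           origin=(0.0, 0.0, 0.0), cell_size=(1.0, 1.0, 1.0))
    codes = sorted(cell_code(c, dims=(4, 4, 4)) for c in cells)
    print(cells, codes)
```

In a two-level index of the kind the abstract describes, codes like these could serve as lookup keys into a global index that maps cells to data blocks, with a local index then used inside each block. That mapping is likewise an assumption here.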
Keywords/Search Tags: spatial-temporal big data, Hadoop, distributed search method