
Key Techniques Of Mass Geographic Raster Data Storage

Posted on: 2014-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
Full Text: PDF
GTID: 2308330479479396
Subject: Software engineering
Abstract/Summary:
With the development of satellite, aerial mapping, and remote sensing technology, geographic imagery is becoming much easier to acquire, and the scale of global image data sets is expanding rapidly. The benefit is that geographic information systems can now show high-precision maps of every corner of the world. But such vast amounts of geographic data also pose great challenges for the research and construction of information systems, and the storage of and access to massive geographic raster data is one of them. For example, the raster files of an 18-layer pyramid number on the order of twenty billion, with a data volume exceeding the PB level. Because a geographic information system must also provide online, real-time data services, the back-end storage system faces demanding requirements for file access latency, concurrent access, and fault tolerance. At present, open-source mass file systems and storage systems do not offer both this storage scale and this low-latency access.

To meet these requirements, the SMDFS distributed file system was built on top of HDFS. In SMDFS, the files in the same directory are packaged into one data file, and metadata management is organized in two levels: the first level manages large files just as HDFS does, and the second level manages the small files inside those large files. The second-level metadata is distributed across the data nodes, which solves the problem of storing massive numbers of small files and accessing them with low latency. SMDFS is mainly suited to workloads in which each directory contains many small files. Geographic raster data, however, is organized as quadtrees (pyramids). A pyramid represents a geographic area and usually takes the form of a multilayer quad directory: apart from the leaf nodes, each directory has four subdirectories and a small number of raster images.
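As a rough check on the scale quoted above, the sketch below counts the directories of a complete quadtree pyramid in which every non-leaf directory has four subdirectories. It assumes a single pyramid with all 18 layers fully populated, which the abstract does not state, and the function name `pyramid_node_counts` is illustrative only.

```python
def pyramid_node_counts(layers: int) -> list:
    """Number of directories on each layer of a complete quadtree pyramid.

    Assumption: layer 0 is a single root directory and every directory on a
    non-leaf layer has exactly four subdirectories.
    """
    return [4 ** level for level in range(layers)]


counts = pyramid_node_counts(18)
total = sum(counts)

print(counts[-1])  # directories on the leaf layer: 17179869184
print(total)       # directories in the whole pyramid: 22906492245
```

Under these assumptions the pyramid has about 22.9 billion directories. If each directory holds roughly one raster tile, that is the same order of magnitude as the twenty billion files cited.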
SMDFS is therefore not well suited to raster images organized as pyramids: because the amount of metadata is not really reduced, storage capacity and access efficiency remain very low.

To address the problems of storing and accessing massive numbers of raster files, this paper proposes the concept of a polymerization space. A polymerization space is a unit of file aggregation. The files in the same polymerization space are packaged into one data file; the metadata server maintains the polymerization index information, and the data server maintains the index of the small files inside each polymerized file. In this distributed file system, polymerization spaces are organized into a tree structure, and a polymerization space can contain multiple child polymerization spaces. When massive numbers of files are to be stored, a mapping algorithm is selected or designed according to the file directory structure to map the files into a new polymerization space structure. In this way each polymerization space contains a large number of small files, and the efficiency of storage and access improves.

To achieve storage of and real-time access to massive pyramid raster data, a localized priority folded storage method was designed, which maps the directory structure of the pyramid files into a structure of file polymerization spaces. The method maps an n-layer pyramid directory structure into a polymerization space structure of about n/2 layers. Files in directories of the same layer that share the same ancestor directory are mapped into the same polymerization space, so that adjacent raster files are also stored adjacently, which makes polymerized storage efficient.
With the localized priority folded storage method, the number of files in one polymerization space increases by a factor of ?n/2?4 compared with a single original directory, which greatly improves storage and access performance.

Building on SMDFS, this paper studies and designs the pyramid object, transforms the pyramid model into a pyramid object model, and takes the pyramid object as the unit of raster file storage and management. Finally, a distributed storage system for small files based on polymerization spaces is implemented. Extensive tests show that this system efficiently stores massive quadtree raster data and provides fast real-time access, remains compatible with the storage of both massive small files and large files, and meets the storage and online service requirements of a global GIS.
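The polymerization space described in the abstract, one packed data file plus a two-level index, can be illustrated with a minimal in-memory sketch. Everything named here (`PolymerizationSpace`, `SpaceStore`, `toy_mapper`) is hypothetical and not taken from the thesis. The mapping rule is deliberately a placeholder that groups files by top-level quadrant; it is not the localized priority folded storage method, whose details the abstract does not give. The tree of child spaces is also omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class PolymerizationSpace:
    """One aggregation unit: a packed data file plus its small-file index."""
    key: str
    data: bytearray = field(default_factory=bytearray)
    # Second-level metadata: small-file path -> (offset, length) inside `data`.
    index: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def put(self, path: str, payload: bytes) -> None:
        # Append the payload and remember where it sits in the packed file.
        self.index[path] = (len(self.data), len(payload))
        self.data.extend(payload)

    def get(self, path: str) -> bytes:
        offset, length = self.index[path]
        return bytes(self.data[offset:offset + length])


class SpaceStore:
    """First-level metadata: space key -> polymerization space."""

    def __init__(self, mapper: Callable[[str], str]) -> None:
        self.mapper = mapper  # decides which space a file path belongs to
        self.spaces: Dict[str, PolymerizationSpace] = {}

    def write(self, path: str, payload: bytes) -> None:
        key = self.mapper(path)
        space = self.spaces.setdefault(key, PolymerizationSpace(key))
        space.put(path, payload)

    def read(self, path: str) -> bytes:
        return self.spaces[self.mapper(path)].get(path)


def toy_mapper(path: str) -> str:
    """Placeholder rule (an assumption): group by top-level quadrant."""
    return path.split("/", 1)[0]


store = SpaceStore(toy_mapper)
store.write("0/tile.png", b"quadrant tile")
store.write("0/3/tile.png", b"child tile")
store.write("2/1/tile.png", b"another area")

print(len(store.spaces))           # 2
print(store.read("0/3/tile.png"))  # b'child tile'
```

Passing the mapping in as a function mirrors the abstract's point that the mapping algorithm is chosen according to the directory structure: a different rule changes how many small files land in each space without changing the storage layer.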
Keywords/Search Tags: Small files distributed storage system, Quadtree, Polymerization storage, Polymerization storage space, Folded polymerization