
Research On Storage Layout Optimization Of Long-term Astronomical Data Archive

Posted on: 2020-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z Li
GTID: 2518306518462974
Subject: Computer Science and Technology
Abstract/Summary:
As astronomical observation capabilities continue to improve, the volume of data acquired by astronomical observations has exploded, and many notable research results build on these massive historical observations. A long-term archiving system for astronomical observation data is therefore of great significance for astronomical research. However, such massive volumes of observation data bring significant energy and performance problems to the archiving system, and optimizing the storage layout is an effective way to reduce energy consumption and improve performance. Because data requests to the archiving system usually target a particular area of the celestial sphere, the temporally aggregated storage layout used at the observation site is often expensive in both energy consumption and query performance. A spatially aggregated storage layout is an effective way to optimize the energy consumption and performance of the archiving system. The challenge is to aggregate the observation data of nearby sky areas onto one storage device while maintaining high storage capacity utilization and load balance across storage devices.

To solve this problem, this thesis develops AstroLayout, a storage layout conversion tool for the archiving system. When astronomical data generated at the observation site is added to the archiving system, AstroLayout generates a new spatially aggregated storage layout, GpDL, suited to the archiving system, and migrates the data from the source storage layout to GpDL. GpDL introduces graph partitioning into the generation of spatially aggregated layouts. HEALPix is first used to divide the celestial sphere into fine-grained cells. Adjacent cells are then aggregated into several sub-regions by graph partitioning according to the density of the data distribution, so that the data volume is balanced across sub-regions. During data migration, AstroLayout supports a variety of target storage devices, such as hard disks, optical discs and tapes. It also provides resumable transfer after interruption, timeout detection and file verification to improve the tool's availability.

Experiments show that GpDL achieves storage capacity utilization of up to 91% while saving considerable resources for the archiving system. Compared with TaDL (the temporally aggregated storage layout used at the observation site), AmrDL (a spatially aggregated storage layout based on adaptive mesh refinement) and SrpDL (a spatially aggregated storage layout based on the Spark Range Partitioner), GpDL effectively reduces energy cost and query time for the same data requests.
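The idea of aggregating adjacent sky cells into sub-regions of balanced data volume can be illustrated with a small sketch. This is not the thesis's implementation: a toy rows-by-columns grid stands in for HEALPix cells, and a greedy breadth-first region growing stands in for a real graph partitioner. All names here (`neighbors`, `partition_cells`, `region_loads`) are hypothetical and chosen only for illustration.

```python
from collections import deque


def neighbors(cell, n_rows, n_cols):
    """Yield the 4-neighbours of a cell on a toy grid (assumed stand-in for HEALPix).

    Columns wrap around, loosely mimicking right ascension; rows do not.
    """
    r, c = cell
    yield (r, (c - 1) % n_cols)
    yield (r, (c + 1) % n_cols)
    if r > 0:
        yield (r - 1, c)
    if r < n_rows - 1:
        yield (r + 1, c)


def partition_cells(volume, n_rows, n_cols, n_regions):
    """Greedily grow regions of adjacent cells with roughly equal data volume.

    volume maps (row, col) -> data volume stored for that cell.
    Returns a dict mapping each cell to a region index in [0, n_regions).
    """
    target = sum(volume.values()) / n_regions
    assignment = {}
    region = 0
    for seed in sorted(volume):  # deterministic seed order
        if seed in assignment:
            continue
        last = region == n_regions - 1  # the last region absorbs what is left
        assignment[seed] = region
        load = volume[seed]
        queue = deque([seed])
        while queue and (last or load < target):
            cell = queue.popleft()
            for nb in neighbors(cell, n_rows, n_cols):
                if nb not in assignment and (last or load < target):
                    assignment[nb] = region
                    load += volume[nb]
                    queue.append(nb)
        if not last:
            region += 1
    return assignment


def region_loads(assignment, volume, n_regions):
    """Sum the data volume assigned to each region."""
    loads = [0] * n_regions
    for cell, region in assignment.items():
        loads[region] += volume[cell]
    return loads


if __name__ == "__main__":
    rows, cols, k = 4, 8, 4
    # Hypothetical data volumes: one dense patch of sky, sparse elsewhere.
    vol = {(r, c): (10 if r < 2 and c < 2 else 1)
           for r in range(rows) for c in range(cols)}
    result = partition_cells(vol, rows, cols, k)
    print(region_loads(result, vol, k))
```

Each region here would map to one storage device. A dedicated graph partitioner would typically give better balance and more compact regions than this greedy pass, which can leave the last region fragmented.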
Keywords/Search Tags:Astronomical Observation, Data Archive, Storage Layout, Spatially Aggregated