
The Design And Implementation Of Multi-stars Storage And Cross-match Based On Hadoop

Posted on: 2015-03-16  Degree: Master  Type: Thesis
Country: China  Candidate: X X Zhang  Full Text: PDF
GTID: 2268330431456279  Subject: Computer application technology
Abstract/Summary:
With advances in science and technology, astronomy has entered the era of full-band sky surveys, and the data volume in each band is growing rapidly. Now that LAMOST has begun its formal survey, massive spectral data will be released gradually as the survey proceeds. At the same time, catalog data from other countries around the world continue to be published, such as WISE (Wide-field Infrared Survey Explorer), FIRST (Faint Images of the Radio Sky at Twenty-Centimeters), Pan-STARRS (Panoramic Survey Telescope and Rapid Response System), SDSS (Sloan Digital Sky Survey), and 2MASS (Two Micron All Sky Survey). Because survey telescopes differ in performance, their catalogs differ in error radius and in the bands they cover, so the astrophysical information contained in each catalog differs as well. To obtain more comprehensive and systematic information about celestial bodies, information from multiple bands must be aggregated; this aggregation is also the foundation for later statistical analysis and data mining.

The key to handling massive astronomical catalog data is efficient storage and cross-matching of catalogs. Distributed and parallel computing techniques for large-scale data processing are needed to solve this problem. This thesis studies the processing of massive astronomical data based on Hadoop. The main work is divided into three parts:

1. Building effective data storage for different catalogs with the HBase component of Hadoop, to improve disk utilization and the efficiency of catalog table queries.
2. Implementing efficient cross-identification across multiple catalogs on Hadoop, based on the pseudo-two-dimensional spherical indexing methods HEALPix and HTM.
3. Storing the cross-identification results in Hadoop, so that users can download the cross-identification results and download query results filtered by specific information.

This thesis implements storage of massive astronomical data and cross-identification across multiple catalogs on Hadoop. The approach improves the efficiency of both catalog storage and cross-identification, and it offers a useful reference for similar massive astronomical data problems in the future.
Keywords/Search Tags: massive astronomical catalog data, Hadoop, HBase, cross-match
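
The abstract does not describe the cross-identification implementation itself. As a rough illustration of index-assisted cross-matching, the Python sketch below swaps HEALPix/HTM pixelization for a much simpler declination-zone bucketing; this substitution, the function names, and the `(source_id, ra, dec)` tuple layout are all assumptions made here, not the thesis's method. Sources from one catalog are grouped by zone, and each source from the other catalog is compared only against the same zone and its two neighbours, using the great-circle separation and a fixed match radius.

```python
import math
from collections import defaultdict


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, computed with the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def zone_of(dec, zone_height):
    """Hypothetical zone id: index of the declination strip containing dec."""
    return int(math.floor((dec + 90.0) / zone_height))


def build_zone_index(catalog, zone_height):
    """Group (source_id, ra, dec) tuples by declination zone."""
    index = defaultdict(list)
    for source_id, ra, dec in catalog:
        index[zone_of(dec, zone_height)].append((source_id, ra, dec))
    return index


def cross_match(catalog_a, catalog_b, radius_deg):
    """Return (id_a, id_b, separation_deg) for all pairs within radius_deg.

    The zone height equals the match radius, so any counterpart lies in the
    same zone or an adjacent one: the declination difference never exceeds
    the angular separation.
    """
    index_b = build_zone_index(catalog_b, radius_deg)
    matches = []
    for id_a, ra_a, dec_a in catalog_a:
        zone = zone_of(dec_a, radius_deg)
        for z in (zone - 1, zone, zone + 1):
            for id_b, ra_b, dec_b in index_b.get(z, ()):
                sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
                if sep <= radius_deg:
                    matches.append((id_a, id_b, sep))
    return matches


if __name__ == "__main__":
    # Made-up coordinates in degrees, for illustration only.
    cat_a = [("a1", 10.0, 20.0), ("a2", 359.9999, -45.0)]
    cat_b = [("b1", 10.0002, 20.0001), ("b2", 0.0001, -45.0), ("b3", 50.0, 20.0)]
    for id_a, id_b, sep in cross_match(cat_a, cat_b, radius_deg=1.0 / 3600.0):
        print(id_a, id_b, f"{sep * 3600.0:.3f} arcsec")
```

In a distributed setting, one plausible design is to use the zone or pixel id as the partitioning key, so that each worker compares only sources that share a bucket or border one. Whether the thesis does this is not stated in the abstract.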