
Construction And Application Of LAMOST Scientific Cloud Computation Platform

Posted on: 2014-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: X X Wang
GTID: 2248330398959388
Subject: Computer application technology
Abstract/Summary:
With advances in detector and space technology, astronomical observation has expanded from visible light and the radio band to every band of the electromagnetic spectrum, including infrared, ultraviolet, X-rays, and gamma rays. The result is all-band astronomy, and the field has entered an era of full-band coverage, large samples, and massive volumes of information. Astronomy has become a leading data-intensive discipline: astronomical data is both huge in volume and fast-growing, and the data produced by a sky survey project usually reaches the TB or even PB level. For example, the Sloan Digital Sky Survey (SDSS) took ten years to cover 8000 square degrees of sky and obtained about 40 TB of data on about 10^8 stars and galaxies. As the LAMOST sky survey proceeds toward spectral observations of 10 million galaxies, 1 million quasars, and 10 million stars, it will generate about ten times as much observational data as SDSS, which poses a great challenge for storage and processing.

To meet the needs of LAMOST, this thesis builds a scientific computing platform suited to astronomical data processing, for storing and processing massive spectral data, and designs and implements a customizable cloud storage system. The main work is as follows:

1. A scientific computing platform for astronomical data processing was built on 24 servers in the LAMOST data processing center. It is based on the open-source Hadoop framework and includes the NumPy, SciPy, and PyFITS toolkits. To make it easy to add or remove physical nodes and to balance load, an automatic deployment package was also written in Python and Shell.

2. A multi-user cloud storage system, built on HDFS, a core component of Hadoop, lets users create folders, upload and download files and folders, delete files and folders, and use a recycle bin, a notepad, and personal information management. In addition, the administrator can manage accounts (add, modify, set quotas, or delete) and units, and query system information. The system makes it convenient for users to store information such as related data and processing results.

3. The MapReduce programming model, a core component of the scientific computing platform, was studied. Building on an existing, mature template matching algorithm, template matching was implemented in the MapReduce style for 338 DR7 spectra and 316 DR8 spectra. The KNN and chi-square minimization algorithms were used to test these data, and the experimental results in the single-machine and cluster environments were compared and analyzed.
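The chi-square template matching mentioned in the abstract can be pictured as a map step that scores one spectrum against every template and a reduce step that keeps the lowest score. The sketch below is only an illustration under stated assumptions, not the thesis's implementation: it runs locally in plain Python rather than on Hadoop or Pydoop, the function names (`chi_square`, `map_spectrum`, `reduce_best`, `run_local`) are hypothetical, the free scale factor fitted per template is an assumption, and the KNN step is omitted.

```python
from collections import defaultdict


def chi_square(flux, template, sigma):
    """Chi-square between a spectrum and a template, after fitting a scale factor.

    Assumption: the template may be rescaled by a single factor `a`; the
    closed-form `a` below minimizes sum(((f - a*t) / s) ** 2).
    """
    num = sum(f * t / s ** 2 for f, t, s in zip(flux, template, sigma))
    den = sum(t * t / s ** 2 for t, s in zip(template, sigma))
    a = num / den
    return sum(((f - a * t) / s) ** 2 for f, t, s in zip(flux, template, sigma))


def map_spectrum(spec_id, flux, sigma, templates):
    """Map step: emit (spec_id, (chi2, template_name)) for every template."""
    for name, template in templates.items():
        yield spec_id, (chi_square(flux, template, sigma), name)


def reduce_best(spec_id, scored):
    """Reduce step: keep the template with the smallest chi-square."""
    chi2, name = min(scored)
    return spec_id, name, chi2


def run_local(spectra, templates):
    """Stand-in for the shuffle phase: group map output by key, then reduce."""
    grouped = defaultdict(list)
    for spec_id, (flux, sigma) in spectra.items():
        for key, value in map_spectrum(spec_id, flux, sigma, templates):
            grouped[key].append(value)
    # Result maps spec_id -> (best_template_name, chi2).
    return {key: reduce_best(key, values)[1:] for key, values in grouped.items()}
```

Calling `run_local(spectra, templates)` with `spectra` as a dict of `spec_id -> (flux, sigma)` lists and `templates` as a dict of `name -> flux` list returns the best-matching template name and its chi-square per spectrum. Because each spectrum is scored independently in the map step, the work divides naturally across cluster nodes, which is what makes a single-machine versus cluster comparison meaningful.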
Keywords/Search Tags: Hadoop, Cloud Storage, Pydoop, MapReduce, Template Matching