Research On Massive Data Distributed Processing And Control For Astronomy

Posted on: 2018-12-01  Degree: Doctor  Type: Dissertation
Country: China  Candidate: S L Wei  Full Text: PDF
GTID: 1310330536986143  Subject: Astronomical technology and methods
Abstract/Summary:
With advances in manufacturing technology and the adoption of high-resolution astronomical instruments, modern large telescopes face the challenge of massive data processing. The Mingantu Spectral Radioheliograph (MUSER) is a synthetic aperture radio interferometer. As a solar-dedicated interferometric array, MUSER can produce high-quality radio images with high temporal, spatial, and spectral resolution. With a normal observing time of 8 hours per day, it generates about 2.6 TB of raw data daily. Unlike traditional processing of historical data, MUSER requires both real-time and batch data processing. For massive data processing and management in MUSER, the traditional approach of a single host with multi-threaded or multi-core parallel computing has many limitations. The current mainstream open source distributed computing frameworks, such as Hadoop, are not well suited to astronomical data processing because they depend on specific storage systems, expose complex programming interfaces, and impose their own data objects and models. There is therefore an urgent need for a new platform/framework to accelerate astronomical data processing. It has to provide high speed, scalability, and an easy programming interface, and it must be dynamically tunable to the demands of MUSER.

The New Vacuum Solar Telescope (NVST) has begun routine observation. To meet new observing needs, new terminal devices have been added to NVST, but these additional terminals are operated independently, without unified control or observation scheduling. As a result, the current observing procedure relies heavily on manual operation. To make fuller use of NVST, it is necessary to improve its automation and informatization and to integrate the current subsystems so that observations can run automatically.

This dissertation therefore takes distributed data processing technology as its core focus and studies the design of a distributed computing framework, the concrete applications of distributed computing in MUSER, and the design of a ZeroMQ-based network communication model for a telescope observation control system. It mainly covers the following aspects:

1) Applied research on real-time data processing of MUSER based on Spark Streaming. A customized receiver was created for the real-time binary stream of MUSER. Customized partitioning and other techniques were used to optimize real-time processing performance. Asynchronous execution was adopted to improve processing stability.

2) Design and implementation of a distributed computing framework for astronomical data processing. We designed a distributed computing framework called OpenCluster, written in Python, which is widely used in astronomy. OpenCluster provides simple programming interfaces that let astronomers extend existing code into distributed applications quickly and easily. A heartbeat mechanism was implemented in OpenCluster to detect node failures. A lightweight courtesy method was adopted for leader selection among factories to achieve high availability. OpenCluster was successfully applied to MUSER real-time and batch data processing. This dissertation also presents the usage and design of the web interface for the MUSER data processing application.

3) Research on cluster resource scheduling. To address the shortcomings of the standalone scheduling model in OpenCluster, we implemented a scheduling mode based on one framework per job and centralized storage, which solves the problems of resource isolation and sharing and of priority scheduling for multiple computing frameworks in the cluster.

4) Building a lightweight astronomical private cloud environment based on Docker. To improve the reliability of long-running services in MUSER, we used the combination of "Mesos + Marathon + Docker" to dispatch containers. We also deployed an environment with Kubernetes to create reliable long-running services in MUSER.

5) Design of a network communication model for a telescope observation control system based on ZeroMQ. First, we analyzed the limitations of network communication in RTS2, an open source astronomical telescope control system. Second, we discussed the applicability of MQTT to telescope control systems, given the similarity between device control in the Internet of Things and in telescope control. Finally, we presented the design of network communication based on ZeroMQ.

In this dissertation, distributed processing technology and a lightweight Docker-based container cloud are used to solve the problems of real-time and historical data processing and of reliable service deployment in MUSER. The ZeroMQ-based network communication model lays a good foundation for a future observation control system. The research methods also provide a reference for massive distributed data processing at similar telescopes and for the design of telescope observation control systems.
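The heartbeat-based node failure check mentioned in aspect 2 can be illustrated with a minimal sketch. This is not the OpenCluster implementation, which the abstract does not describe in detail. The class name HeartbeatMonitor, its methods, and the timeout value are all hypothetical, chosen only to show the general technique: each node reports periodically, and a node is considered failed once its last report is older than a timeout.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Hypothetical sketch: record the last heartbeat time of each node
    and report the nodes that have been silent longer than a timeout."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[str, float] = {}

    def beat(self, node_id: str) -> None:
        """Called whenever a heartbeat message arrives from a node."""
        self._last_seen[node_id] = self._clock()

    def failed_nodes(self) -> List[str]:
        """Return, in sorted order, the nodes whose last heartbeat is
        older than the timeout."""
        now = self._clock()
        return sorted(
            node for node, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

In a real cluster the heartbeats would arrive over the network and the master would poll failed_nodes() periodically to reschedule work; both parts are left out here.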
Keywords/Search Tags: Massive data, radio data processing, distributed processing, resource dispatching, observation control