
A Framework for Content-Aware Distributed Scheduling of Astronomical Data Streams

Posted on: 2021-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: X K Zhang
GTID: 2480306113454604
Subject: Software engineering
Abstract/Summary:
Greatly improved telescope observation capabilities and a growing number of observation cameras, especially the rise of short-time-scale, large-field-of-view observations, have produced astronomical big data streams that must be observed and processed in real time. Large-field-of-view observations require an array of multiple cameras collecting data simultaneously. A short time scale means that a camera's pointing direction and angle change within a short time during observation, so the data produced by the same camera before and after the change correspond to different observation ranges and must be treated differently. Because they are designed for general use, existing stream processing platforms do not identify and separate the data components in a data stream, and they cannot schedule tasks for those components as the components change. They also suffer from limited development flexibility, substantial functional redundancy, a poor match with the application scenario, and difficulty of implementation.

In response to the data processing and task scheduling needs of astronomical data streams, this thesis proposes a content-aware distributed scheduling framework for astronomical data streams. The framework separates the data components in the data stream, binds each data component to its data source, and assigns a task node to each data source to process that source's data. It monitors changes in the content of each separated data stream and schedules tasks according to those changes. The framework is efficient, reliable, and fault-tolerant. It adopts a master-slave design consisting mainly of a master node daemon process and a slave node daemon process, each implemented with a multi-threaded pipeline architecture. The framework uses Docker virtualization to encapsulate the data processing logic: Docker containers are easy to start and stop and to feed data into and export data from, which makes them convenient for the scheduling framework to use. Reliability is guaranteed by a buffer mechanism, a load balancing mechanism, a data retransmission mechanism, and an abandoned-task processing mechanism. Fault tolerance is ensured by heartbeat messages sent from the slave nodes, real-time monitoring of Docker running status on the slave nodes, and the export of microservice error logs.

Application to the GWAC camera array shows that the scheduling framework meets GWAC's task scheduling requirement of about 1 GB of data every 15 s, supports real-time astronomical observation tasks, generates star light curves online, and helps astronomers carry out camera follow-up of scientifically significant astronomical phenomena. The scheduling framework can also be applied to real-time data processing for other scientific devices that produce multi-source, variable-source data streams.
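The content-aware scheduling idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the class name `Scheduler`, the method `dispatch`, the use of a string "content key" (for example, a camera's current observation field), and the least-loaded node choice are all assumptions made for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """Hypothetical master-side scheduler: one task node per data source."""
    nodes: list[str]
    assignment: dict[str, str] = field(default_factory=dict)    # source -> node
    last_content: dict[str, str] = field(default_factory=dict)  # source -> content key
    load: dict[str, int] = field(default_factory=dict)          # node -> bound sources

    def _least_loaded(self) -> str:
        # Assumed load-balancing rule: pick the node with the fewest bound sources.
        return min(self.nodes, key=lambda n: self.load.get(n, 0))

    def dispatch(self, source: str, content_key: str) -> tuple[str, bool]:
        """Return (node, rescheduled) for one incoming data unit.

        A source keeps its node while its content key is unchanged; when the
        key changes (e.g. the camera points at a new field), the source is
        rebound to the currently least-loaded node.
        """
        changed = self.last_content.get(source) != content_key
        if changed:
            old = self.assignment.get(source)
            if old is not None:
                self.load[old] -= 1  # release the previous binding
            node = self._least_loaded()
            self.assignment[source] = node
            self.load[node] = self.load.get(node, 0) + 1
            self.last_content[source] = content_key
        return self.assignment[source], changed
```

In this sketch, a call such as `Scheduler(nodes=["n1", "n2"]).dispatch("cam1", "fieldA")` binds the hypothetical source `cam1` to a node, and later calls with the same content key reuse that binding; only a change in the key triggers rescheduling.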
Keywords/Search Tags: scheduling framework, data flow, astronomical big data