Design And Implementation Of Cluster Monitoring Platform

Posted on: 2020-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: D P Sheng
Full Text: PDF
GTID: 2428330575453074
Subject: Engineering
Abstract/Summary:
With the rapid development of high-performance computing and related fields, real-time monitoring of clusters has become the most important part of operation and maintenance. Ensuring the security and stability of the cluster is the cornerstone of cluster operation, and it is also a necessary condition for the successful delivery of high-performance computing services. For clusters, the high cost of monitoring systems, or their sale only as part of a hardware bundle, greatly limits cluster development. Building an open-source, controllable, stable, efficient, and visual server monitoring platform is therefore of great significance for improving cluster efficiency, raising the level of operation and maintenance, reducing operation and maintenance costs, and ensuring the safe operation of the cluster.

Studies have shown that the execution time of computing tasks is directly related to CPU load. This thesis therefore studies a prediction model for the CPU load data collected by the monitoring platform and adds the model to the platform. Predicting the resource status of the cluster in advance can provide important support for operation and maintenance personnel.

The main work of this thesis covers the following aspects. Based on an analysis of mainstream open-source monitoring solutions, a cluster monitoring platform based on Zabbix is designed and implemented. The design and research focus on data storage, abnormality alarms, and visual display in the monitoring platform, and optimizations and solutions are given for each. On the basis of monitoring the various indicators of the cluster servers, the database structure of the monitoring platform is modified, the database is optimized through partitioning and a stored procedure to improve platform performance, and a formula for estimating the data volume of a table partition is proposed. The platform delivers abnormality alarms through multiple media such as email and WeChat. Using a visualization tool, the monitoring platform pages are extended through secondary development: the main page of the existing platform is expanded to display monitoring information more clearly and intuitively. CPU load prediction is also studied. Based on time series prediction methods, an absolute error measure and an objective function for a combined model are proposed. A combined forecasting model of autoregression and exponential smoothing is constructed by minimizing the objective function value, and the prediction algorithm is implemented in the cluster monitoring platform.

Tests after implementation show that the system performs well, accurately monitors the performance indicators of the cluster, and achieves the database performance improvement, the alarm media expansion, and the visualization development. In addition, the algorithm is verified on real cluster load data collected by the platform. Comparing actual and predicted values on different data sets confirms the accuracy and effectiveness of the prediction algorithm.
Keywords/Search Tags:Zabbix, cluster monitoring, time series, load forecasting
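
The abstract describes the combined forecasting model only in outline, so the following is a minimal sketch of the general idea rather than the thesis's actual algorithm. It assumes an AR(1) model fitted by least squares, simple exponential smoothing with a fixed `alpha`, and a combination weight chosen by grid search to minimize the mean absolute error. All function names, the choice of AR order, the grid search, and the sample load values are illustrative assumptions that do not come from the thesis.

```python
from statistics import mean


def ses_forecasts(series, alpha=0.5):
    """One-step-ahead simple exponential smoothing forecasts for series[1:]."""
    level = series[0]
    forecasts = []
    for x in series[1:]:
        forecasts.append(level)                  # forecast made before seeing x
        level = alpha * x + (1 - alpha) * level  # update the smoothed level
    return forecasts


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = mean(prev), mean(curr)
    var = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = cov / var if var else 0.0
    return mean_curr - phi * mean_prev, phi


def ar1_forecasts(series, c, phi):
    """One-step-ahead AR(1) forecasts for series[1:]."""
    return [c + phi * p for p in series[:-1]]


def mae(actual, predicted):
    """Mean absolute error, used here as the objective function (an assumption)."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def combine(f_ar, f_ses, w):
    """Weighted combination: w * AR forecast + (1 - w) * smoothing forecast."""
    return [w * a + (1 - w) * s for a, s in zip(f_ar, f_ses)]


def best_weight(actual, f_ar, f_ses, steps=100):
    """Grid search over w in [0, 1] for the weight with the smallest MAE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: mae(actual, combine(f_ar, f_ses, w)))


# Made-up CPU load samples, for illustration only.
load = [0.8, 1.1, 0.9, 1.4, 1.6, 1.3, 1.7, 2.0, 1.8, 2.1]
c, phi = fit_ar1(load)
f_ar = ar1_forecasts(load, c, phi)
f_ses = ses_forecasts(load, alpha=0.5)
w = best_weight(load[1:], f_ar, f_ses)
print("weight on AR forecast:", w)
print("combined MAE:", mae(load[1:], combine(f_ar, f_ses, w)))
```

Because the grid includes `w = 0` and `w = 1`, the combined forecast in this sketch is never worse in-sample than either individual model. The weight is fitted and evaluated on the same data here for brevity; a real evaluation would hold out a separate test window.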