High-performance Cluster Monitoring Data Analysis Based On Visual Technology

Posted on: 2022-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Tang
Full Text: PDF
GTID: 2518306758474624
Subject: Computer Software and Application of Computer
Abstract/Summary:
A high-performance computing system is a complex system composed of a large number of compute nodes and storage nodes. The detailed execution of a job on a cluster is effectively a black box: neither users nor cluster administrators can see how a job uses system resources such as CPU, network, and I/O. As a result, they cannot detect in time when a job runs inefficiently, for example with load imbalance or I/O bottlenecks, which seriously limits the achievable performance of the cluster. An effective method for judging job load balance is therefore important. Based on a comprehensive analysis of high-performance cluster data, an integrated load scoring algorithm is proposed to find compute nodes with performance problems. The algorithm can identify performance anomalies across the different dimensions of the performance metrics on a node. In addition, a job load balance visualization based on a force-directed layout, a time area chart of node health statistics, and a river diagram of I/O fluctuation patterns are designed to support system operation. The visual analysis system is shown to improve the efficiency of identifying load-imbalanced jobs and slow compute nodes, which in turn improves job performance and system resilience.

This thesis reviews research and development on high-performance cluster visualization in China and abroad, summarizes the load and I/O performance problems of supercomputer clusters, and describes the challenges of processing and visually analyzing large volumes of high-performance cluster data. In response to these problems and challenges, it proposes a job load balancing evaluation algorithm that measures how unevenly load is distributed across the nodes used by a job. To find abnormal nodes more effectively, it also designs a comprehensive node load balancing scoring algorithm, which is validated experimentally. Finally, based on the results of the cluster data analysis, it designs several visual analysis systems for exploring large-scale real-time I/O performance data, uses an edge bundling algorithm for visual layout, and proposes a clustering algorithm to address the I/O space problem.

In summary, the contributions of this thesis are as follows:

1) A visual analysis framework for large-scale cluster monitoring data. The efficiency with which clusters process jobs still needs to be improved: a job runs as a black box in the cluster, and the problems it encounters cannot be perceived. Monitoring data visualization is a new approach to performance analysis that addresses this black-box problem. A cluster monitoring data framework is also built to address the collection delay, limited scale, and data persistence problems of open-source monitoring systems. The visualization adopts a browser/server (B/S) architecture to analyze multi-dimensional monitoring data from tens of thousands of cluster nodes: the browser provides the visual display and the interactive interface, while the server is responsible for drawing and computation. This requires studying efficient real-time capture of cluster data, data preprocessing, and model building; improving rendering quality; reducing interaction latency; implementing the visual framework; and proposing visual analysis methods for the performance problems that are found.

2) A job load evaluation method and a node scoring algorithm. Jobs have different types of load. Starting from the hundreds of dimensions of data collected on each node, selected dimensions such as CPU load and network information are combined into comprehensive characterization parameters such as node load and energy consumption. On this basis, a job load evaluation method for very large-scale multi-dimensional data is proposed, together with a visual model based on a force-directed graph layout. A second method analyzes the nodes with relatively uneven load within a job and studies nodes whose performance is low because of resource loss or other causes. A comprehensive node load balancing scoring algorithm is also proposed to find problem nodes. Finally, visual analysis is used to help administrators carry out equipment maintenance.

3) Visual analysis of I/O contention. By analyzing and predicting abnormal jobs and exploring the causes of their problems, it is found that a large number of jobs and I/Os on the same storage node are prone to I/O contention. The size and number of I/Os on each storage node therefore need to be analyzed to evaluate the degree of contention. If a job's I/O is scattered across too many storage nodes, it achieves high parallelism but competes with more jobs for storage node resources, which adversely affects the other jobs. Visualization is used to analyze a large number of jobs and I/O storage nodes, and a hierarchical clustering visualization is designed. An edge bundling algorithm is used to reduce the visual occlusion in this view. In addition, visualization methods based on clustering algorithms are analyzed for the I/O space problems of supercomputers.
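The abstract does not give the formulas behind the job load balancing evaluation or the node scoring algorithm. As a rough illustration of the general idea only, the sketch below assumes two common stand-ins: the coefficient of variation of per-node load as a job-level imbalance measure, and a modified z-score (based on the median absolute deviation) to flag nodes that deviate strongly from the rest of the job. The function names, the 3.5 threshold, and the sample values are hypothetical and are not taken from the thesis.

```python
from statistics import mean, median, pstdev


def load_imbalance(loads):
    """Coefficient of variation of per-node load (assumed metric).

    Returns 0.0 for a perfectly even job; larger values mean more imbalance.
    """
    avg = mean(loads)
    if avg == 0:
        return 0.0  # an idle job has no measurable imbalance
    return pstdev(loads) / avg


def flag_outlier_nodes(node_loads, threshold=3.5):
    """Return names of nodes whose load is far from the job median.

    Uses the modified z-score 0.6745 * (x - median) / MAD; the threshold
    of 3.5 is a conventional default, not a value from the thesis.
    """
    values = list(node_loads.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # all nodes (nearly) identical: nothing to flag
    return [
        name
        for name, v in node_loads.items()
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical CPU utilisation of the five nodes used by one job.
    job = {"n1": 0.90, "n2": 0.88, "n3": 0.91, "n4": 0.89, "n5": 0.20}
    print("imbalance:", round(load_imbalance(list(job.values())), 3))
    print("suspect nodes:", flag_outlier_nodes(job))
```

A single metric such as CPU utilisation is used here for brevity; the thesis describes scoring across many metric dimensions, which this sketch does not attempt to reproduce.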
Keywords/Search Tags:High-performance cluster, Load balancing, I/O performance, Performance visualization