
Online Monitoring And Prediction System On Grid Computing Resources And Tasks

Posted on: 2011-12-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Tang
GTID: 1118360332957340
Subject: Computer system architecture
Abstract/Summary:
The emergence of grid computing breaks through the traditional distributed computing schema. Differences between resources, such as geographic location, hardware architecture, operating system, and management structure, are all shielded in a grid environment. Grid technology links millions of idle computers distributed across the Internet and forms them into a "super server" with powerful computing and storage capacity. In this way, all kinds of high performance computing resources, storage resources, data resources, and other special equipment resources are organized dynamically to solve large-scale problems in industrial production and scientific research.

Grid computing is an essential technique in high performance computing. It has two basic objects: the grid task and the grid resource. Online monitoring and management of these two objects is of great importance for improving resource utilization, optimizing task scheduling, and exploiting the capability of the grid. Monitoring is an online reflection of grid performance states and is the foundation of prediction. Prediction is an online reflection of grid performance variation and is the necessary supplement to monitoring. This thesis focuses on building an online monitoring and prediction system for grid tasks and resources. We discuss key issues including system architecture, workflow design, systematic methods, and the prototype system.

In designing the monitoring and prediction system, the monitoring and prediction function should act as an independent component and merge seamlessly into the grid computing system. Therefore, we first conduct an in-depth comparison of mainstream grid computing systems and identify the distinctive characteristics of grid components and their organization. Furthermore, we analyze why a grid computing system needs monitoring and prediction functions, and take these reasons as the starting point for system design.

Typical grid systems are mostly built according to the Web Services Resource Framework (WSRF).
Therefore, our design also follows the WSRF standard: all system services run at the level of grid services, and system functions are realized through dynamic collaboration among these services. The system falls into four parts according to function: grid resource monitoring, grid resource prediction, grid task monitoring, and grid task prediction. The four parts complement each other. To reduce the system overhead on computing nodes, an information node is deployed to handle the management, storage, and serving of grid performance information. Furthermore, only essential services and sensors run on each system node. The system uses a browser as the user interface, so that grid users do not need to install any additional software to access the system via the information service.

Grid resource monitoring focuses on performance information about resource distribution, availability, host load, etc., obtained by means of a monitoring mechanism. We compared typical resource monitoring systems, reviewed mainstream monitoring methods, and analyzed their feasibility in our system. We proposed a series of design principles for the resource monitoring system and gave the overall architecture for building the prototype subsystem. The function of resource monitoring is accomplished through the collaboration of the resource monitoring service, resource sensors, and the information service. We illustrated the workflow and explained the information format. We also recorded the overhead that the monitoring subsystem imposes on the grid, including CPU usage and memory usage. Statistical results indicated that the overhead is very small.

Grid resource prediction focuses on the variation trend and running track of resources in the grid system, as well as future resource distribution, availability, host load, etc., by means of modeling and analyzing historical monitoring data. We reviewed related research from the angles of structure design and methodology.
We described the mathematical formulation of the resource prediction problem and explained the overall resource modeling method in detail, including data pre-processing, modeling, and post-processing. We proposed a series of design principles for the resource prediction system and gave the overall architecture for building the prototype subsystem. The function of resource prediction is accomplished through the collaboration of the resource monitoring service, resource prediction service, optimizing service, and information service. Benchmark datasets of host load and network bandwidth are employed to test the prediction methods. We implemented five models for resource prediction: Back Propagation Neural Network, Radial Basis Function Neural Network, General Hybrid Neural Network, Epsilon-Support Vector Regression, and Nu-Support Vector Regression. Comparison of efficiency and accuracy indicated that the two support vector regression models achieve lower error with less training time in both one-step-ahead and multi-step-ahead resource prediction, and are thus feasible for the grid resource prediction subsystem.

Grid task monitoring focuses on performance information about task distribution, execution states, actions, etc., obtained by means of a monitoring mechanism. We compared several typical task monitoring tools, summarized mainstream task monitoring methods, and analyzed their feasibility in our system. We proposed a series of design principles for the task monitoring system and gave the overall architecture for building the prototype subsystem. The function of task monitoring is accomplished through the collaboration of the task monitoring service, task sensors, and the information service. We also recorded the overhead that the task monitoring subsystem imposes on the grid, including CPU usage and memory usage.
Statistical results indicated that the prototype system has very low overhead.

Grid task prediction focuses on the running track of tasks in the grid system, how the tasks will occupy computing resources, the execution time, etc., by means of modeling and analyzing historical monitoring data. Grid task prediction usually contains two parts: how the tasks will occupy resources, and how much time a task will take. In this thesis, we focus on the latter, namely running time prediction. We summarized and reviewed typical task prediction methods together with related systems, described the mathematical formulation of the execution time prediction problem, and explained the overall task modeling method in detail, including data pre-processing and modeling. We proposed a series of design principles for the execution time prediction system and gave the overall architecture for building the prototype subsystem. The function of task prediction is accomplished through the collaboration of the task monitoring service, task prediction service, optimizing service, and information service. A historical dataset of task executions is employed to test the prediction methods, with the accuracy and efficiency of time prediction as evaluation criteria. We implemented five models for time prediction: Back Propagation Neural Network, Radial Basis Function Neural Network, General Hybrid Neural Network, Epsilon-Support Vector Regression, and Nu-Support Vector Regression. In terms of efficiency and accuracy, the two support vector regression models achieve both lower error and less training time, and are thus feasible for the grid task prediction subsystem.

The results of the thesis will help complete and advance grid infrastructure.
Building on the research presented here, we expect further work to extend in the following directions: optimization of the prediction models, extension of the managed objects, and extension of the application schema of the information. We believe that, as an important tool for high performance computing, grid computing technology deserves much more development and application research.
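
The prediction workflow summarized above (pre-processing, modeling, post-processing, with one-step-ahead and multi-step-ahead forecasts) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the min-max scaling, the window length, the recursive multi-step strategy, and the sample values are all hypothetical. The window-mean predictor is only a stand-in; it is not one of the five neural network or support vector regression models named in the abstract, which would instead be fitted on the training pairs produced by `make_windows`.

```python
from typing import Callable, List, Sequence, Tuple


def min_max_scale(series: Sequence[float]) -> Tuple[List[float], float, float]:
    """Pre-processing (assumed): scale a series into [0, 1] and return the bounds."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    return [(x - lo) / span for x in series], lo, hi


def inverse_scale(value: float, lo: float, hi: float) -> float:
    """Post-processing (assumed): map a scaled prediction back to original units."""
    span = (hi - lo) or 1.0
    return value * span + lo


def make_windows(series: Sequence[float], window: int) -> List[Tuple[List[float], float]]:
    """Turn a series into (past window, next value) pairs for model training."""
    return [(list(series[i:i + window]), series[i + window])
            for i in range(len(series) - window)]


def window_mean(history: Sequence[float]) -> float:
    """Placeholder model: predict the mean of the window (not a thesis model)."""
    return sum(history) / len(history)


def predict_ahead(series: Sequence[float], window: int, steps: int,
                  model: Callable[[Sequence[float]], float] = window_mean) -> List[float]:
    """Recursive multi-step-ahead prediction; steps=1 gives one-step-ahead.

    Each prediction is appended to the history and fed back as input
    for the next step.
    """
    history = list(series[-window:])
    out: List[float] = []
    for _ in range(steps):
        y = model(history)
        out.append(y)
        history = history[1:] + [y]
    return out


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Accuracy criterion (assumed): mean absolute error over paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Made-up host load samples, for illustration only.
    host_load = [0.20, 0.25, 0.30, 0.28, 0.35, 0.40, 0.38, 0.45]
    scaled, lo, hi = min_max_scale(host_load)
    pairs = make_windows(scaled, window=3)
    forecast = [inverse_scale(y, lo, hi)
                for y in predict_ahead(scaled, window=3, steps=2)]
    print(len(pairs), forecast)
```

Swapping `window_mean` for a trained regressor leaves the rest of the pipeline unchanged, which is one way to compare models on equal terms.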
Keywords/Search Tags: grid resource, grid task, monitoring system, prediction system, artificial neural network, support vector machine