
Fault Diagnosis And Management Of Monitor Platform For Distributed Cluster

Posted on: 2016-03-09 | Degree: Master | Type: Thesis
Country: China | Candidate: X Zhang | Full Text: PDF
GTID: 2298330467992514 | Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of network technology, distributed computing has become a major trend in computer technology. It has greatly advanced high-performance computing architectures and has been applied in many fields, such as scientific research and geography. The distributed cluster is an important part of distributed technology, and many of its software and hardware resources need to be monitored in real time so that the cluster can achieve load balancing and complete its scheduling work successfully. Cluster failures must also be managed and resolved to ensure the stability of the system.

First, this thesis introduces the background and value of the research, reviews the current state of monitoring and management systems worldwide, and identifies the weaknesses of these systems and the goals of our research. Second, it describes distributed technology and cluster monitoring technology and summarizes the data transmission modes used in monitoring systems. It then presents several key problems in implementing the system, together with their solutions. The thesis focuses on the design and implementation of a monitoring and management system for distributed clusters based on Flume and Zookeeper. Following the system's functional components, it analyzes the main modules, including the fault monitoring module, the fault alarm module, and the management module, and explains the design and implementation of each. Finally, the system is tested and evaluated, its shortcomings and possible improvements are summarized, and directions for future research are proposed based on the problems that remain.
Keywords/Search Tags: distributed cluster, fault monitoring and alarm, management for cluster, Flume, Zookeeper