Font Size: a A A

Design And Implementation Of Hadoop Cluster Monitoring System

Posted on:2014-01-21Degree:MasterType:Thesis
Country:ChinaCandidate:Y H XuFull Text:PDF
GTID:2268330422451986Subject:Software engineering
Abstract/Summary:PDF Full Text Request
In recent years, since the rapidly developing of cloud computing, widelyapplying of distributed cluster system, to develop cloud computing technology hasbecome the core strategy selection to seize the commanding heights of economicand technological development for countries. It is the mainstream for governmentinformation technology in the next decade to research and develop cloudcomputing technology, shift from the existing information technologyinfrastructure to cloud computing platform. Since Hadoop cluster is composed bynumerous computer clusters, it becomes a major difficulty to monitor and managethe computer cluster. To be able to detect problems in the cluster, and takeappropriate measures in short time, Hadoop cluster administrators need toconstantly collect performance data, analyze the state of each node, and find outexactly what’s the problem, which requires Hadoop cluster monitoring system forassistance. Monitoring data storage will become a big problem because of such alarge amount of data from Hadoop cluster. To deal with this, it becomes onerequirement to research and combine a distributed system and non-relationaldatabases.This paper mainly designs and completes the development of a Hadoop clustermonitoring system, which monitors performance of the nodes in Hadoop clustersystem. What’s more, it can also monitor distributed file system and job task. Inorder to facilitate Hadoop cluster administrators to manage cluster nodes inreal-time, the system will display the related real-time monitoring information inweb page. Monitoring system will use cacti for node performance collection. Sinceall the existing Hadoop monitoring systems are not able to store comprehensivemonitoring data, the choice of database has been studied and demonstrate thatnon-relational databases is the top choice for Hadoop Distributed MonitoringSystems. Finally it realized to carry out relevant functions and performance testing,which is required to use MongoDB to store data and historical inquiry for Hadoopmonitoring system.
Keywords/Search Tags:clusters, monitor, database, Hadoop
PDF Full Text Request
Related items