The Research And Implementation Of Hadoop Cluster Monitoring System Based On Ganglia

Posted on:2018-06-24

Degree:Master

Type:Thesis

Country:China

Candidate:X F Wei

Full Text:PDF

GTID:2348330518999380

Subject:Engineering

Abstract/Summary:

As an open-source distributed computing framework, Hadoop is widely used in big data analysis and processing. As it has developed rapidly, the data Hadoop handles has grown larger and more complex; if a cluster fails or crashes, the damage may be irreparable, so the stability of Hadoop is a concern. To ensure the stability and robustness of a working cluster, it is important to develop an integrated and complete cluster monitoring system.

Based on the open-source monitoring frameworks Ganglia and Nagios, this thesis implements a cluster monitoring system that collects monitoring data, stores it persistently, displays it, and raises threshold-based alarms. Data collection is handled by the open-source monitoring frameworks. The collected data is then loaded persistently into a data warehouse by an ETL tool, which facilitates long-term storage and diverse, fast queries. The display function presents real-time and historical monitoring data on the front-end page using Java Web technology. The alarm mechanism derives the pattern of change in the data through statistical analysis of all historical data; when the value of a monitored metric exceeds its threshold, it sends out the related alarm information, which helps troubleshoot problems and optimize the cluster.

During development, the data warehouse was designed using table splitting and partitioning. Monitoring data from different hosts and different years is kept separate, and data from different months is divided into different blocks. The good scalability of the data warehouse not only benefits the storage of large amounts of monitoring data but also supports fast queries of historical data in various forms. For the statistical analysis of large amounts of historical data, a design scheme that assigns the tasks to each day is used, which greatly simplifies the statistical calculation and quickly reflects the pattern of change in the historical data. Finally, a threshold calculation scheme based on confidence intervals provides strong support for setting appropriate thresholds.

After a series of tests, the cluster monitoring system is shown to be correct and sufficiently robust, capable of monitoring the cluster as a whole, and able to achieve the aim of ensuring the stability and robustness of a working cluster. Finally, the thesis summarizes the research process, analyzes the merits and demerits of the system, and puts forward a series of constructive suggestions for extending and improving the alarm mechanism beyond the threshold mechanism, including real-time monitoring of log information, dynamic prediction of monitoring data, and automatic troubleshooting of cluster failures. These further studies would make the system more intelligent and applicable to a wider range of areas.
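
The abstract does not give the exact statistics or threshold formula, so the following is only a minimal sketch of how per-day statistics and an interval-based threshold could fit together. It assumes that metric samples are roughly normally distributed and that the alarm band is the sample mean plus or minus z standard deviations at the chosen confidence level; the function names (`daily_summary`, `merge_summaries`, `interval_thresholds`, `should_alarm`) and the 95% default are hypothetical and not taken from the thesis.

```python
import math
from statistics import NormalDist


def daily_summary(samples):
    """Reduce one day's metric samples to (count, sum, sum of squares)."""
    return (len(samples), sum(samples), sum(x * x for x in samples))


def merge_summaries(summaries):
    """Combine per-day summaries without revisiting the raw samples."""
    count = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    return (count, total, total_sq)


def interval_thresholds(summary, confidence=0.95):
    """Return (low, high) as mean -/+ z * sample standard deviation.

    Assumption: a normal approximation on individual sample values;
    the thesis may define its confidence interval differently.
    """
    count, total, total_sq = summary
    if count < 2:
        raise ValueError("need at least two samples")
    mean = total / count
    # Sample variance from the running sums; clamp tiny negative rounding errors.
    variance = max((total_sq - count * mean * mean) / (count - 1), 0.0)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(variance)
    return (mean - half_width, mean + half_width)


def should_alarm(value, thresholds):
    """True when a metric value falls outside the threshold band."""
    low, high = thresholds
    return value < low or value > high


# Two days of made-up samples for one metric on one host.
days = [daily_summary([10, 20, 30]), daily_summary([40, 50])]
low, high = interval_thresholds(merge_summaries(days))
print(should_alarm(75, (low, high)))
```

Keeping only three running sums per day means the historical statistics can be refreshed by adding one new daily summary, which is one plausible reading of assigning the statistical tasks to each day.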

Keywords/Search Tags:

Hadoop, Ganglia, Nagios, ETL, Confidence interval

Related items

1	Hadoop Cluster Monitoring System Based On Ganglia
2	Real-time Performance Monitoring And I/O Performance Optimization Research On Hadoop Cluster
3	The Research And Application Of Hadoop Cluster Monitoring System
4	The Research And Implementation Of Clusters Monitoring System Based On Android
5	Research On Estimation And Prediction Of Confidence Interval For Web Service Composition QoS Based On Bootstrap
6	Research And Application Of Hadoop Job Scheduling
7	The Research On Age Estimation For Facial Images
8	The Methods And Applied Technology Research Of Data Fusion
9	Research And Implementation Of Cloud Monitoring System Based On Ganglia
10	Research On Confidence Machine Learning Methods Based On Controllability