The Design And Implementation Of Cluster Management Subsystem In CAT Distributed Monitoring System

Posted on: 2020-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Han
Full Text: PDF
GTID: 2428330575455040
Subject: Engineering
Abstract/Summary:
With the rapid development of the Internet industry, system architectures continue to evolve, and the monitoring system, which tracks the real-time running status of a system, has become a basic component of distributed architectures. Inside Meituan, the CAT distributed monitoring system sustains a throughput of gigabytes per second from more than 10,000 upstream applications running on more than 80,000 machines. To cope with this throughput, more than 100 physical machines in the production environment form an indicator processing cluster that provides the required bandwidth and computing power. Operating a cluster of this scale has exposed several problems: the status of the indicator processing cluster itself is difficult to observe, the nodes in the cluster are difficult to control as a whole, and traffic differs greatly from node to node.

To solve these problems, this thesis designs and implements a cluster management subsystem. The subsystem plays the role of Master in the CAT distributed monitoring system. It is mainly used to maintain and regulate routing information in the cluster, to manage the indicator processing nodes, and to balance load across the system. The subsystem connects to and interacts with the indicator processing nodes through the Netty component. It also performs statistical analysis on the status of the indicator processing cluster itself: cluster-related indicator data is transmitted through a Kafka message queue and eventually persisted to an Elasticsearch database.

The cluster management subsystem is divided into four modules: indicator statistics, routing management, node management and load balancing. The indicator statistics module analyzes the indicator data of the indicator processing nodes themselves and provides rich indicator query functions. The routing management module maintains the routing information of the distributed system and provides routing management functions. The node management module manages and monitors nodes and applications. The load balancing module is the core of the subsystem: it makes decisions through strategies and an algorithm model, and autonomously coordinates traffic and load between the nodes in the cluster.

The subsystem is now in use in the production environment. It considerably reduces the routine operation and maintenance workload of administrators. It also alleviates traffic hot spots and load imbalance among nodes, and improves the overall processing capability and stability of the distributed system.
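The abstract does not describe the strategy or algorithm model used by the load balancing module. As an illustration only of what "coordinating traffic between nodes" through routing changes could look like, the following is a minimal sketch under stated assumptions: a greedy reassignment of applications from the busiest node to the least busy one. The names `node_loads` and `rebalance`, the `tolerance` parameter, and the idea of treating per-application traffic as a single number are all hypothetical and are not taken from the thesis or from CAT.

```python
from typing import Dict, List


def node_loads(routes: Dict[str, str], traffic: Dict[str, float],
               nodes: List[str]) -> Dict[str, float]:
    """Sum the traffic of every application routed to each node."""
    load = {n: 0.0 for n in nodes}
    for app, node in routes.items():
        load[node] += traffic[app]
    return load


def rebalance(routes: Dict[str, str], traffic: Dict[str, float],
              nodes: List[str], tolerance: float = 0.1,
              max_moves: int = 100) -> Dict[str, str]:
    """Hypothetical greedy rebalancer (not the thesis's algorithm).

    routes maps application -> node, traffic maps application -> load.
    Repeatedly moves one application from the hottest node to the
    coldest node until their gap is within tolerance * mean load,
    or no single move can shrink the gap.
    """
    routes = dict(routes)  # work on a copy; the caller's table is untouched
    for _ in range(max_moves):
        load = node_loads(routes, traffic, nodes)
        hot = max(nodes, key=load.get)
        cold = min(nodes, key=load.get)
        gap = load[hot] - load[cold]
        mean = sum(load.values()) / len(nodes)
        if mean == 0 or gap <= tolerance * mean:
            break
        # Moving an app with traffic t changes the hot/cold gap to |gap - 2t|,
        # so only apps with 0 < t < gap make the pair more balanced.
        candidates = [a for a, n in routes.items()
                      if n == hot and 0 < traffic[a] < gap]
        if not candidates:
            break
        # Prefer the app whose traffic is closest to half the gap.
        app = min(candidates, key=lambda a: abs(traffic[a] - gap / 2))
        routes[app] = cold
    return routes
```

In this sketch the Master would compute a new routing table from observed per-application traffic and then push it to the nodes; how the real subsystem gathers traffic data and distributes routes is beyond what the abstract states.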
Keywords/Search Tags:Monitor System, Distributed, Cluster Management, Load Balance