Design And Implementation Of Server Monitoring System Based On Machine Learning

Posted on: 2022-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Xiong
GTID: 2518306524971699
Subject: Master of Engineering
Abstract/Summary:
Big data and cloud computing technologies are widely used across industries. Because big data and cloud platforms run on top of basic hardware, the stable operation of servers directly affects the stability of the applications on the cloud platform and has a great impact on the user experience. Based on the failure monitoring requirements of a customized cluster server room, this thesis applies a server fault detection algorithm framework grounded in machine learning theory and tests the algorithm. The main research contents of this thesis are as follows:

(1) Design and application of the machine learning algorithm: To meet the demand for server fault prediction, the current typical fault detection algorithms are surveyed, and a fault detection algorithm based on Support Vector Machine (SVM) classification is adopted. The algorithm is tested with data from the production environment. The test results show that the algorithm is effective and efficient.

(2) Requirements and design of the server fault monitoring system: Through investigation and user interviews, the requirements of the server lab are comprehensively analyzed in terms of server group management, data acquisition and calculation, and data visualization. Based on these requirements, a three-tier architecture is proposed for the server fault monitoring system, comprising the client bottom layer, the data middle layer, and the visual user layer.

(3) Implementation and testing of the server fault monitoring system: Current mainstream software frameworks are selected to implement a real-time data acquisition and analysis platform that includes the client bottom layer, the data middle layer, and the visual user layer. The platform collects and analyzes the real-time operating status of each server and detects fault states. At the same time, the collected real-time data and the analysis results are displayed through rich visualization charts, and functions such as alarms and reminders are implemented.

The fault monitoring system for large-scale server clusters is designed to support early warning for servers, real-time alarms for fault events, and subsequent fault tracing queries. By monitoring and discovering fault points in a timely manner, it improves the operation and maintenance management and emergency response capability of the whole system and helps safeguard the availability and user experience of the application system.
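
The abstract names Support Vector Machine classification as the fault detection method but does not give the features, kernel, or training procedure. To make the idea concrete, here is a minimal sketch and not the thesis's implementation. It trains a linear SVM with hinge loss by stochastic subgradient descent, using only the Python standard library. Everything in it is an assumption made for illustration: the three features (CPU utilization, memory utilization, normalized temperature), the synthetic data, the hyperparameters, and the function names.

```python
import random

# Hypothetical features, each scaled to [0, 1]:
# [cpu_utilization, memory_utilization, normalized_temperature]
# Labels: -1 = normal, +1 = fault. All data here is synthetic.


def make_synthetic_data(n_per_class=20, seed=0):
    """Generate well-separated synthetic samples for both classes."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n_per_class):
        # Assumed "normal" operating range.
        samples.append([rng.uniform(0.2, 0.4),
                        rng.uniform(0.3, 0.5),
                        rng.uniform(0.3, 0.5)])
        labels.append(-1)
        # Assumed "fault" operating range.
        samples.append([rng.uniform(0.85, 1.0),
                        rng.uniform(0.8, 1.0),
                        rng.uniform(0.8, 1.0)])
        labels.append(1)
    return samples, labels


def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(samples, labels, epochs=200, lr=0.05, lam=0.01, seed=0):
    """Minimize L2-regularized hinge loss with stochastic subgradient steps."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            # A sample contributes a hinge-loss subgradient only if it is
            # misclassified or inside the margin.
            violated = y * decision_value(w, b, x) < 1
            # Regularization step: shrink the weights slightly.
            w = [wj * (1 - lr * lam) for wj in w]
            if violated:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (fault) or -1 (normal)."""
    return 1 if decision_value(w, b, x) >= 0 else -1


if __name__ == "__main__":
    samples, labels = make_synthetic_data()
    w, b = train_linear_svm(samples, labels)
    for reading in ([0.30, 0.40, 0.40], [0.95, 0.90, 0.90]):
        state = "fault" if predict(w, b, reading) == 1 else "normal"
        print(reading, "->", state)
```

In a monitoring pipeline like the one described, a classifier of this kind would sit in the data middle layer and score each reading arriving from the client layer. A production system would more likely use an established SVM library and real labelled fault data than this hand-rolled trainer.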
Keywords/Search Tags: Machine Learning, Fault Detection, Server, Support Vector Machine