
The Design And Implementation Of Business Capacity Management And Control Subsystem Of Application Performance Management Monitoring Platform

Posted on: 2020-04-05 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Liu
GTID: 2518305741980439 | Subject: Master of Engineering (field of software engineering)
Abstract/Summary:
The application performance management monitoring platform is JD's internal platform for monitoring and analyzing all applications under the micro-service architecture. It ensures the normal operation of the system so that users obtain high-quality services, and it lets R&D staff view multi-dimensional monitoring information such as services, performance, components, and basic services through the platform anytime, anywhere, so that they can quickly understand the operating status of related services and make timely adjustments. As an e-commerce platform, JD faces peak traffic during promotion periods, so expanding the capacity of core services and degrading the capacity of non-core applications are important functions of the application performance management platform. Forecasting the resource consumption indicators that describe the system's future operating state is therefore the most important step in this function. Manual expansion and contraction based on previous operation and maintenance experience requires too much manpower, and its low prediction accuracy often forces a second round of expansion or degradation. Predictive models based on machine learning are more adaptable than these traditional methods. This thesis applies machine learning models to the prediction of system resource consumption indicators in the application performance management platform and establishes the business capacity management and control subsystem.

The thesis first introduces the background knowledge and related technologies. It then describes the overall structure of the application performance management monitoring platform, explains the position and working scenario of the business capacity management and control subsystem within the whole platform, analyzes the specific requirements of the subsystem, and summarizes its function points. In the design and implementation part that follows, the subsystem is divided into four modules: the data acquisition and preprocessing module, the application classification module, the capacity estimation module, and the system expansion and contraction module. Each module is presented in terms of its application scenario, algorithm design, algorithm implementation, comparison of algorithm results, key code, and front-end interface.

The thesis establishes different machine learning models, built on different machine learning algorithms, for the different requirements that arise in business capacity management. From data preprocessing, to application classification, to the estimation of application capacity and resource consumption, each key step is closely combined with the business scenario. The isolation forest algorithm is used to remove noise from the daily monitoring data; a sliding window algorithm is designed to construct data structures at different time granularities; and the linear regression and LSTM (Long Short-Term Memory network) algorithms are applied to build the business capacity estimation model. The estimated values are then combined with the expansion and contraction strategy to expand and shrink different services accurately. Since its launch, the system has withstood the peak traffic of Double 11 and is running well. According to statistics, accurate capacity estimation combined with the comprehensive expansion and contraction strategy saved the company more than 7,000 cores of machine resources and freed up 1.5 staff during the promotion period. Operation and maintenance efficiency has improved greatly, and the platform's basic requirements for the enterprise have been met.
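The abstract names the techniques but gives no detail on how they fit together. The following is a minimal sketch, not the thesis's implementation, of three of the steps it mentions: a sliding window that re-aggregates a metric to a coarser time granularity, a linear regression that relates traffic to resource usage, and a scaling decision derived from the forecast. All function names, the sample figures, and the `headroom` rule are assumptions made for illustration; the isolation forest and LSTM steps are omitted.

```python
import math
from statistics import mean


def sliding_windows(series, width, step=1):
    """Return every window of `width` consecutive samples, advancing by `step`."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def aggregate(series, width):
    """Coarsen time granularity by averaging non-overlapping windows."""
    return [mean(w) for w in sliding_windows(series, width, step=width)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def instances_needed(predicted_cores, cores_per_instance, headroom=0.2):
    """Hypothetical scaling rule: provision the forecast plus a safety margin."""
    return math.ceil(predicted_cores * (1 + headroom) / cores_per_instance)


# Illustrative (made-up) history: requests per second vs. CPU cores consumed.
qps_history = [100, 200, 300, 400]
cores_history = [12, 22, 32, 42]

slope, intercept = fit_line(qps_history, cores_history)
predicted_cores = slope * 1000 + intercept  # forecast at an assumed peak of 1000 QPS
target_instances = instances_needed(predicted_cores, cores_per_instance=8)
```

Comparing `target_instances` with the number of instances currently running would indicate whether a service should be expanded or shrunk; in the thesis this decision is made by the expansion and contraction strategy, whose rules the abstract does not specify.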
Keywords/Search Tags:Application Performance Management, Capacity Estimation, Machine Learning, Resource Expansion and Reduction