Research On Management System Of Internet Big Data Analysis Platform Based On Machine Learning

Posted on: 2020-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Shi
GTID: 2428330575956334
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of the Internet, network services have grown explosively and Internet traffic data has surged. The operation and maintenance environment of the Internet big data analysis platform has become extremely complicated, and the data it generates is large in volume and constantly changing. More and more indicators need to be monitored, which poses a great challenge to traditional platform operation and maintenance modes and technical solutions. At present, operation and maintenance personnel monitor the platform's indicators with simple detectors or fixed thresholds, and the thresholds differ across business scenarios. As the platform's business grows, the number of monitored indicators grows with it, and so does the number of thresholds that must be set. Manually set thresholds cause many false alarms and missed alarms. Operation and maintenance personnel must spend more time monitoring, troubleshooting, and fixing problems, so platform operation and maintenance becomes very passive and its cost increases. Operation and maintenance based on rules and simple threshold settings can only cope with simple scenarios and is difficult to extend. In contrast, statistical methods and machine learning provide a more flexible expression that is robust and capable of coping with changing demand.

This thesis introduces the application of machine learning to the operation and maintenance management of a big data analysis platform. The first application is clustering, used to discover the distribution patterns of the platform's raw data and to balance load across acquisition servers. The second is regression, used for intelligent prediction of platform monitoring indicators and for anomaly detection. Unlike traditional image or text data, the experimental data in this thesis are time series, so the feature extraction and clustering methods differ from the traditional machine learning process. The platform's raw data is converted into time series through data aggregation, and different distance calculation methods and clustering methods are then compared. For the platform's monitoring data, this thesis fully exploits the characteristics of time series and expands single time series features into multi-dimensional features with rich expressive power. LSTMs and machine learning regression models are used to predict the time series and are compared with the traditional time series analysis method ARIMA. Building on this accurate prediction ability, this thesis designs an anomaly detection algorithm and obtains satisfactory results in verification, realizing anomaly detection on platform monitoring data.
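
The two ideas in the abstract, expanding a single time series into multi-dimensional features and flagging anomalies from prediction error, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names `lag_features` and `detect_anomalies`, the window sizes, and the threshold factor `k` are all hypothetical, and a rolling mean stands in for the LSTM or regression predictor the thesis describes.

```python
from statistics import mean, stdev


def lag_features(series, n_lags=3):
    """Expand a univariate series into (features, target) rows.

    Features are the previous n_lags values plus their mean and
    standard deviation (an assumed, illustrative feature set).
    """
    rows = []
    for t in range(n_lags, len(series)):
        window = list(series[t - n_lags:t])
        rows.append((window + [mean(window), stdev(window)], series[t]))
    return rows


def detect_anomalies(series, window=5, k=3.0):
    """Return indices whose value deviates from the prediction by more
    than k standard deviations of the preceding window.

    The rolling mean is a placeholder predictor (assumption); any
    forecasting model could supply `predicted` instead.
    """
    anomalies = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        predicted = mean(history)
        spread = stdev(history) or 1e-9  # avoid a zero threshold on flat data
        if abs(series[t] - predicted) > k * spread:
            anomalies.append(t)
    return anomalies


# Hypothetical monitoring indicator with one spike at index 7.
metric = [10, 11, 10, 12, 11, 10, 11, 50, 11, 10, 12, 11]
features = lag_features(metric)
flagged = detect_anomalies(metric)
```

With a stronger predictor the residual threshold can be tighter, which is the reason the abstract ties anomaly detection to prediction accuracy.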
Keywords/Search Tags: Time Series, Anomaly Detection, Machine Learning, Big Data Platform