Anomaly Detection And Root Cause Analysis Based On Multivariate Metrics In Cloud System

Posted on: 2023-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: R Z Zhuang
Full Text: PDF
GTID: 2558306830454844
Subject: Computer Science and Technology
Abstract/Summary:
As cloud systems have developed, more and more people deploy their applications and tasks on them. Ensuring the reliability and fault tolerance of cloud systems has therefore become essential to improving the user experience and the cloud environment. In recent years, as artificial intelligence has become popular in many fields, academia and industry have begun to apply machine learning to anomaly detection and root cause analysis on the large volume of multivariate metrics that cloud systems generate, which helps people monitor the status of a cloud system and locate anomalies quickly and accurately. However, most existing anomaly detection and root cause analysis methods based on multivariate metrics rely on learning features from historical data, and they face three main challenges. First, cloud system data is imbalanced and unlabeled, which makes anomaly detection models hard to train. Second, the complex dependencies between metrics and their temporal features make the metrics difficult to analyze. Third, the root cause metric must be located efficiently among hundreds of metrics in the cloud system. To address these problems, this thesis aims to eliminate noisy labels in unlabeled and imbalanced datasets and to establish a more accurate unsupervised time series anomaly detection model and root cause analysis method. The specific work is summarized as follows:

(1) To remove noisy labels from unlabeled and imbalanced datasets, this thesis proposes a time series denoising method based on stacked autoencoders and K-Means++. The method uses stacked autoencoders (SAE) to extract features from the original data and reduce its dimensionality, and then uses K-Means++ clustering to process the noisy labels in the dataset. Experiments on public datasets, with comparisons against other machine learning methods, demonstrate the effectiveness of this method.

(2) For anomaly detection on multi-dimensional time series metrics, this thesis proposes an unsupervised time series anomaly detection model based on stacked long short-term memory (LSTM) neural networks and multi-objective generative adversarial networks, with the denoising method used as data preprocessing. The method uses stacked LSTM networks as the base network to capture the dependencies between metrics and their temporal features, and uses multiple generators to improve the accuracy of the generative anomaly detection model. Experiments show that the proposed method outperforms the other anomaly detection models it is compared with.

(3) For root cause analysis of multivariate metrics, this thesis proposes a suspicious metric localization method based on a two-stage feature selection algorithm. The method first screens the large number of metrics with ReliefF, and then uses the SVM-RFE algorithm to compute and rank the metric weights, yielding the suspicious metric that leads to the anomaly. Experiments on a cloud system demonstrate the effectiveness of the proposed method on multivariate metrics.
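As a rough illustration of the clustering stage in item (1), the sketch below runs K-Means++ seeding followed by standard Lloyd iterations on feature vectors that are assumed to have been reduced already (the stacked autoencoder stage is omitted). The function names `kmeanspp_init`, `kmeans`, and `flag_minority` are hypothetical, and the rule of treating the smaller of two clusters as the suspected noisy points is an assumption made for this example, not a detail taken from the thesis.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng):
    """K-Means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(weights) == 0:
            # All points coincide with a chosen centre; fall back to uniform.
            centres.append(rng.choice(points))
        else:
            centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def kmeans(points, k, iters=50, seed=0):
    """Lloyd iterations started from K-Means++ seeds (hypothetical helper)."""
    rng = random.Random(seed)
    centres = kmeanspp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def flag_minority(points, seed=0):
    """Assumption for this sketch: with two clusters, the smaller one
    holds the suspected noisy or anomalous samples."""
    labels, _ = kmeans(points, 2, seed=seed)
    minority = 0 if labels.count(0) <= labels.count(1) else 1
    return [lab == minority for lab in labels]
```

Calling `flag_minority(encoded)` on a list of encoded vectors returns one boolean per sample, which could then be used to drop or relabel the flagged samples before training a detector. In practice a library implementation such as scikit-learn's `KMeans` with `init="k-means++"` would likely replace the hand-written loop.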
Keywords/Search Tags: Cloud system, Multi-dimensional metrics, Unsupervised, Anomaly detection, Feature selection algorithm