
An Anomaly Detection Method For System Logs Using Venn-Abers Predictors

Posted on: 2022-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: L L Pan
Full Text: PDF
GTID: 2568306488980369
Subject: Engineering
Abstract/Summary:
Logs are widely used to detect anomalies in modern large-scale distributed systems. However, as log volume grows rapidly, traditional anomaly detection, which relies heavily on manual log inspection, has become impractical. To reduce the manual workload, automatic log analysis based on machine learning has been widely studied for log anomaly detection in recent years. Conventional classification methods, however, output only a predicted label. They provide no assessment of how credible that prediction is, and no guarantee that such an assessment is itself valid. Probabilistic prediction algorithms can attach a probability of correctness to each prediction, but they depend heavily on an assumed model of the sample distribution. If the assumed model is wrong, the predicted probabilities are inaccurate and the validity of the reliability assessment cannot be guaranteed. To address these problems, this thesis introduces the Venn-Abers algorithm to log anomaly detection for the first time. The algorithm gives valid probabilities for its predictions under weak assumptions about the data distribution, and log anomalies are detected from the characteristics of the overall probability distribution. The main work is divided into two parts:
(1) Because Venn-Abers is a flexible machine learning framework, three Venn-Abers predictors (SVM-VA, RF-VA, LR-VA) are constructed on top of a support vector machine (SVM), a random forest (RF) and logistic regression (LR) respectively, and are used for anomaly detection on HDFS log data. First, the validity of the probability predictions is established from the properties of the Venn-Abers algorithm. Then, an anomaly detection algorithm is proposed based on the probability distribution produced by the Venn-Abers algorithm. Finally, classification accuracy is compared against the three conventional classifiers, demonstrating the anomaly detection ability of the Venn-Abers algorithm.
(2) To address the performance differences between single models, a Stacking multi-model fusion strategy is proposed that combines a support vector machine (SVM), k-nearest neighbors (KNN), a decision tree (DT), a random forest (RF) and a gradient boosting decision tree (GBDT). On this basis, a Venn-Abers predictor (Stacking-VA) is constructed for the Stacking multi-model fusion algorithm. First, the Stacking fusion algorithm is compared with the single conventional classifiers in terms of log anomaly detection performance and detection error. Then, the Stacking-VA predictor is shown to reflect the validity of the probability predictions better than the single-model Venn-Abers predictors. Finally, the anomaly detection ability of the Stacking-VA predictor is demonstrated using the proposed anomaly detection algorithm. Log anomaly detection accuracy is further found to improve by 2% compared with Stacking multi-model fusion alone.
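To make the idea of a Venn-Abers predictor more concrete, the following is a minimal sketch of the standard inductive Venn-Abers construction in plain Python. It is not the thesis's implementation. It assumes that some underlying classifier (an SVM, a random forest, a Stacking ensemble, and so on) has already produced real-valued scores for a held-out calibration set, with label 1 meaning "anomalous". The function names `isotonic_fit`, `venn_abers` and `merge_probability`, and the toy scores used in the test, are hypothetical and chosen only for illustration. For each test score, the predictor fits isotonic regression twice, once assuming the test object is normal and once assuming it is anomalous, which yields a probability pair (p0, p1).

```python
def isotonic_fit(scores, labels):
    """Fit non-decreasing isotonic regression with pool-adjacent-violators.

    Returns a dict mapping each distinct score to its fitted value.
    """
    # Aggregate tied scores first so each distinct score gets one fitted value.
    totals = {}
    for s, y in zip(scores, labels):
        sum_y, n = totals.get(s, (0.0, 0))
        totals[s] = (sum_y + y, n + 1)
    xs = sorted(totals)

    # Each block holds [sum of labels, number of points, number of distinct scores].
    blocks = []
    for x in xs:
        sum_y, n = totals[x]
        blocks.append([sum_y, n, 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            sum_y, n, k = blocks.pop()
            blocks[-1][0] += sum_y
            blocks[-1][1] += n
            blocks[-1][2] += k

    # Expand the blocks back to one fitted value per distinct score.
    fitted = {}
    i = 0
    for sum_y, n, k in blocks:
        for x in xs[i:i + k]:
            fitted[x] = sum_y / n
        i += k
    return fitted


def venn_abers(cal_scores, cal_labels, test_score):
    """Return (p0, p1): calibrated probabilities of label 1 for test_score.

    p0 is obtained by adding the test object with hypothetical label 0,
    p1 by adding it with hypothetical label 1.
    """
    probs = []
    for hypothetical_label in (0, 1):
        fitted = isotonic_fit(list(cal_scores) + [test_score],
                              list(cal_labels) + [hypothetical_label])
        probs.append(fitted[test_score])
    return probs[0], probs[1]


def merge_probability(p0, p1):
    """Collapse the pair (p0, p1) into a single probability of label 1."""
    return p1 / (1.0 - p0 + p1)
```

The width p1 - p0 indicates how uncertain the calibrated probability is, and the merged value can be compared against a threshold to flag a log sequence as anomalous. Any such threshold is an assumption here; the abstract does not specify the decision rule used in the thesis.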
Keywords/Search Tags:System Log, Anomaly Detection, Machine Learning, Venn-Abers, Model Fusion