| KPI anomaly detection has long been an important task in AIOps (Artificial Intelligence for IT Operations). In large Internet companies, and especially at telecommunications operators, operation and maintenance engineers generally judge whether a service is stable by monitoring the KPIs (Key Performance Indicators) associated with back-end software and equipment. An abnormal KPI often means that the related equipment or application has a problem. Traditional operation and maintenance relied on the personal experience of engineers, who verified problems manually with the help of scattered toolkits when they occurred. As business complexity grows, this approach can no longer meet the need for rapid operation and maintenance management, which gave rise to the concept of AIOps. AIOps aims to meet operational needs by applying artificial intelligence methods. Building an effective KPI anomaly detection system that can be deployed in real business scenarios faces several challenges. First, anomalies occur very rarely, so little anomaly data is available for model analysis. Second, the definition of an anomaly differs between application scenarios. Because of these difficulties, the precision and recall of existing anomaly detection algorithms and models are not high, and they produce many false positives and false negatives. For the problem of rare abnormal samples, the usual remedy is to resample the dataset so that the newly constructed dataset is class-balanced, which either discards data or introduces redundant data. Some machine learning and neural network models can instead weight abnormal samples during training to increase their influence. In the past, when a new kind of anomaly appeared, a dedicated detection model was usually built by hand, and anomalous patterns could not be discovered automatically from the data, as machine learning allows.
In this paper, the Transformer model is adapted specifically for time series anomaly detection, and the resulting model is named Time Transformer E. First, the Time2Vec method is used to embed the time series data into a high-dimensional space as input to the Transformer encoder; then Focal Loss is used to address the class imbalance problem in anomaly detection tasks. Compared with traditional neural networks, this model uses a total of two loss functions and the learning rate warm-start technique. Finally, the model is tested on a public dataset. The experiments show that, on a real KPI dataset, the Time Transformer E anomaly detection model scores above 0.96 on all evaluation metrics (accuracy, precision, recall, and F1 score), far higher than the other methods. |
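
The two building blocks named in the abstract, the Time2Vec embedding and Focal Loss, can be illustrated with a minimal sketch in plain Python. This is not the paper's implementation: the function names, the fixed `freqs` and `phases` values (learned parameters in a real model), and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen only for illustration.

```python
import math


def time2vec(tau, freqs, phases):
    """Time2Vec-style embedding of a scalar time value (illustrative sketch).

    Component 0 is linear in tau; the remaining components are periodic (sine).
    freqs and phases stand in for parameters a real model would learn.
    """
    linear = freqs[0] * tau + phases[0]
    periodic = [math.sin(w * tau + b) for w, b in zip(freqs[1:], phases[1:])]
    return [linear] + periodic


def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs  : predicted probability of the positive (anomalous) class
    labels : 1 for anomalous samples, 0 for normal samples
    gamma, alpha : assumed defaults, not values taken from the paper
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)        # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p         # probability given to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    # Embed one timestamp into a 3-dimensional vector.
    print(time2vec(1.0, freqs=[2.0, math.pi / 2, 0.5], phases=[1.0, 0.0, 0.0]))
    # A confidently correct prediction contributes far less than a badly wrong one.
    print(focal_loss([0.9], [1]), focal_loss([0.1], [1]))
```

The factor `(1 - p_t) ** gamma` shrinks the loss of samples the model already classifies well, so the many easy normal points dominate training less and the rare anomalous points carry more weight. With `gamma=0` the expression reduces to class-weighted cross-entropy.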