| Anomaly prediction is an important technique in data mining and machine learning that aims to predict potential anomalies in advance so that appropriate measures can be taken to reduce potential losses. Time series anomaly prediction, a branch of anomaly detection, aims to predict anomalous points in time series data before they occur. Time series prediction algorithms are widely used in fields such as financial markets, manufacturing, the Internet of Things, healthcare, and natural disaster prediction. However, traditional time series anomaly prediction methods face several challenges: (1) The proportion of positive and negative samples is severely imbalanced. Anomalous samples often account for only a small portion of the dataset, and the large proportion of normal samples can cause overfitting during model training. (2) Most prediction methods lag behind the anomaly. Because anomalies are highly hazardous, in practice a warning is needed before they occur so that targeted measures can be taken. However, most existing anomaly prediction algorithms can only issue a prediction while the anomaly is occurring or after it has occurred, which makes timely preventive measures difficult. (3) Existing anomaly prediction methods struggle to capture the potential dependency relationships between variables in the dataset. Time series datasets often come from multiple scenarios, and there may be underlying logical correlations between behaviors in different scenarios. Relying solely on expert knowledge to capture this latent logical topology is prone to omissions. (4) It is difficult to accurately identify specific anomaly categories. The types of anomalies in real-world scenarios are becoming increasingly diverse, and different types of anomalies often require different targeted measures. However, the similarity between anomalies often causes traditional multi-class classification algorithms to classify one type of anomaly
into another category, which interferes with subsequent preventive measures. To address these challenges, this paper proposes a graph embedding-based multi-task time series anomaly detection module. The module builds on common time series models such as LSTM, GRU, and Transformer and adds four plug-in components, one for each of the four challenges commonly present in anomaly prediction: (1) Temporal progressive sampling component: this paper uses progressive sampling to alleviate the imbalance between positive and negative samples in time series anomaly detection datasets and to achieve early warning of anomalies, thereby reserving enough time to take preventive measures and reduce potential risks. (2) Graph embedding component: to better capture the logical information in the time series anomaly dataset and improve the predictive performance of the model, this paper introduces graph embedding, which obtains the logical information in the dataset while also perceiving the temporal information. (3) Difficulty evaluation component: to get the most out of the existing data, this paper adopts the idea of curriculum learning. A difficulty evaluator scores each sample based on prior difficulty, and the model is trained on samples in easy-to-difficult order. This approach not only improves the performance of the model but also provides a basis for allocating task weights in the subsequent multi-task learning. (4) Multi-task decision-making component: to address the difficulty of identifying anomaly categories, we introduce a multi-task learning method that uses the sample difficulty values obtained by the difficulty evaluation component to allocate a weight to each task. Then, by sharing the common features between different categories of anomalies, the learning efficiency and
prediction accuracy of task-specific models are improved. To validate the effectiveness of our method, we conducted experiments on multiple public datasets and industrial datasets. The experimental results show that the proposed module performs well on time series anomaly prediction tasks and is robust across different scenarios. We also compared our method with existing anomaly prediction methods, and the results show that our method achieves better prediction accuracy and robustness. In summary, this paper proposes a graph embedding-based multi-task time series anomaly prediction module. The module combines graph embedding and multi-task learning, effectively addressing the challenges faced by traditional time series anomaly detection methods. More importantly, our method has been successfully deployed on Microsoft’s M365 and Azure cloud platforms, helping Microsoft greatly reduce the data and economic losses caused by disk failures. |
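
As a minimal sketch of how the difficulty evaluation and multi-task decision-making components could interact, the Python below orders samples from easy to difficult and derives one weight per task from the sample difficulty values. The function names `curriculum_order` and `task_weights`, the toy difficulty scores, and the weighting rule (mean difficulty per task, normalized to sum to 1, so that harder tasks receive larger weights) are all assumptions made for illustration; the abstract does not specify how the difficulty evaluator scores samples or how weights are computed from those scores.

```python
from collections import defaultdict


def curriculum_order(difficulty):
    """Return sample indices sorted from easiest to hardest.

    `difficulty` is a list of per-sample scores; producing those scores is
    the job of the difficulty evaluator, which is not modeled here.
    """
    return sorted(range(len(difficulty)), key=lambda i: difficulty[i])


def task_weights(difficulty, task_ids):
    """Derive one weight per task from per-sample difficulty values.

    Assumed rule (not taken from the paper): a task's weight is its mean
    sample difficulty, normalized so that all weights sum to 1.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for score, task in zip(difficulty, task_ids):
        totals[task] += score
        counts[task] += 1
    means = {task: totals[task] / counts[task] for task in totals}
    norm = sum(means.values())
    if norm == 0:
        # All samples have zero difficulty: fall back to uniform weights.
        return {task: 1.0 / len(means) for task in means}
    return {task: mean / norm for task, mean in means.items()}


# Toy data: four samples belonging to two anomaly-category tasks.
difficulty = [0.9, 0.1, 0.5, 0.3]
task_ids = [0, 1, 0, 1]

order = curriculum_order(difficulty)
weights = task_weights(difficulty, task_ids)
```

Under this assumed rule, the training loop would visit samples in the order given by `order`, and each task's loss would be scaled by its entry in `weights` before the per-task losses are summed.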