
Research On Modeling Forecast And Anomaly Detection Method Of Time Series Stream Data

Posted on: 2021-05-05 | Degree: Master | Type: Thesis
Country: China | Candidate: G X Ma | Full Text: PDF
GTID: 2428330611453440 | Subject: Systems Engineering
Abstract/Summary:
A time series is a set of observations of a variable collected in chronological order, and such data are common in finance, power load, and process control. In a streaming environment, time series data are unbounded in volume, can be scanned only once, arrive in real time, and are noisy. Mining the operating patterns of time series stream data through real-time modeling, and detecting the abnormal patterns hidden in the data on the basis of modeling and prediction, is of practical value for production and daily life.

Most existing time series modeling and prediction algorithms are static and offline, so they cannot perform real-time analysis on streaming data. This thesis therefore focuses on how to select training samples when modeling and predicting time series stream data, so that real-time requirements are met and forecasting accuracy improves. An algorithm based on GEP is designed, and a double sliding window, a colony climbing algorithm, and a data fusion method are added to achieve real-time modeling and forecasting of time series stream data. Four data sets with different levels of Gaussian noise are used as test data, and the predictions of the proposed algorithm and the HTM algorithm are compared experimentally under the same real-time requirement, that is, the same data transmission interval. The Mean Absolute Percentage Error (MAPE) is used as the measure of prediction accuracy. The experimental results show that the overall MAPE of the proposed algorithm on the four data sets is lower than that of the HTM algorithm, indicating that the proposed algorithm predicts more accurately.

Most existing time series anomaly detection algorithms process data in batches and cannot be applied directly to a stream data environment. In addition, the anomaly detection methods in the existing literature that are based on time window distribution and classification focus only on outliers in the time series space and do not consider data points that violate the operating pattern of their time series context. Building on the proposed real-time modeling and prediction algorithm, an anomaly detection method that adaptively changes the threshold of the detection model is designed to address the threshold-setting problem of anomaly detection models in the existing literature. In this kind of model, an observation is judged abnormal when the absolute value of the difference between the value predicted by the prediction model and the actual observed value exceeds the threshold. The proposed algorithm and the ARIMA algorithm are compared in anomaly detection experiments on four data sets with different percentages of outliers, with recall and false alarm rate used to evaluate accuracy and stability. The results show that, compared with the ARIMA algorithm, the proposed algorithm achieves a higher recall and a lower false alarm rate on the four data sets with 2% and 4% outliers, indicating higher anomaly detection accuracy and stability.
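The residual-threshold scheme and the evaluation measures named in the abstract can be illustrated with a short Python sketch. It is not the thesis's implementation: the abstract does not say how the threshold adapts, so the rule used here (mean plus `k` standard deviations of recent non-anomalous residuals) is an assumption, as are the function names and the `window`, `k`, and `min_history` parameters. The predictions are taken as given, so any forecaster (GEP-based or otherwise) could supply them.

```python
from collections import deque
from statistics import mean, pstdev


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (zero actuals are skipped)."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(terms) / len(terms)


def detect_anomalies(actual, predicted, window=20, k=3.0, min_history=5):
    """Flag points whose |actual - predicted| exceeds an adaptive threshold.

    Assumed rule (not from the thesis): threshold = mean + k * std of the
    most recent `window` residuals that were not themselves flagged.
    """
    recent = deque(maxlen=window)  # sliding window of normal residuals
    flags = []
    for a, p in zip(actual, predicted):
        residual = abs(a - p)
        if len(recent) >= min_history:
            threshold = mean(recent) + k * pstdev(recent)
            is_anomaly = residual > threshold
        else:
            is_anomaly = False  # warm-up: too little history to judge
        flags.append(is_anomaly)
        if not is_anomaly:
            recent.append(residual)  # anomalies do not shift the threshold
    return flags


def recall_and_false_alarm_rate(flags, labels):
    """Recall = TP / (TP + FN); false alarm rate = FP / (FP + TN)."""
    tp = sum(1 for f, l in zip(flags, labels) if f and l)
    fn = sum(1 for f, l in zip(flags, labels) if not f and l)
    fp = sum(1 for f, l in zip(flags, labels) if f and not l)
    tn = sum(1 for f, l in zip(flags, labels) if not f and not l)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, false_alarm_rate
```

Keeping flagged residuals out of the window is one possible design choice: it prevents a single large outlier from inflating the threshold and masking the points that follow it, at the cost of adapting more slowly to a genuine shift in the series.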
Keywords/Search Tags:Time series stream data, Data modeling, Sliding window, Data fusion, Anomaly detection