Research On Multivariate Time Series Data Clustering And Anomaly Detection Algorithms

Posted on: 2021-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: W X Ding
GTID: 2428330647950732
Subject: Computer technology
Abstract/Summary:
Mining meaningful information from multivariate time series (MTS) is a common task in many application fields, including network services, industrial systems, health care, aerospace, finance, meteorology, and bioinformatics. An MTS is a sequence of consecutive vectors whose variables are often interrelated and jointly reflect the status of an entity (such as a network service or a piece of industrial equipment). MTS data mining aims to make sense of such complex data and to provide useful insights, and it is an extremely active research field. With the explosive growth of information, many existing methods struggle to handle large and complex MTS data. With the prevalence of big data and deep learning, MTS data mining based on big data, machine learning, and deep learning has been widely studied and applied. MTS data mining comprises many specific tasks. This thesis focuses on the following two: (1) MTS clustering, that is, partitioning an MTS data set into groups according to similarity, so that series within a group are similar to each other and dissimilar to those in other groups. Because an MTS contains multiple interacting variables, MTS clustering is more complicated than univariate time series clustering. (2) MTS anomaly detection, that is, finding the time series that violate the normal pattern. Because MTS data record information about entities in many applications, anomaly detection on MTS data is essential for quality management and risk control. Moreover, MTS data are often high-dimensional and complicated, which makes the task more difficult.

To address the MTS clustering problem, this thesis proposes T-GMRF, an MTS clustering algorithm based on time-varying Gaussian Markov random fields (GMRF). T-GMRF uses GMRFs to describe the inter-correlation between variables, uses principal component analysis (PCA) to project the high-dimensional GMRF sequences onto low-dimensional feature vectors, and adopts multi-density based clustering to obtain the cluster assignment. Extensive experiments on three open MTS datasets show that the proposed T-GMRF method clearly outperforms state-of-the-art methods on various metrics. Comparison experiments further validate the effectiveness of the dimensionality reduction and the multi-density clustering.

To address MTS anomaly detection, this thesis proposes BR-GAN, an MTS anomaly detection algorithm based on a bidirectional recurrent generative adversarial network. Its core idea is to capture the normal patterns of MTS data by learning robust representations of them. The key techniques are a bidirectional recurrent generative adversarial network, an encode-decode-encode network structure, the Wasserstein distance for measuring the difference between the model distribution and the MTS data distribution, an adversarial training procedure, and anomaly scores that combine the reconstruction error of the MTS data with that of the hidden representation. Extensive experiments on three open MTS anomaly detection datasets show that the proposed BR-GAN achieves significantly better detection performance than state-of-the-art methods. Further experiments show that it also outperforms state-of-the-art methods in noise robustness and in training and inference efficiency.
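The abstract does not give the exact form of the anomaly score, so the following is only a minimal sketch of one plausible reading: a weighted sum of the reconstruction error in data space and the error between the two hidden representations produced by an encode-decode-encode structure. The function names, the use of the Euclidean distance, the `weight` parameter, and the fixed threshold are all assumptions made for illustration, not details taken from the thesis.

```python
import math
from typing import List, Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomaly_score(
    x: Sequence[float],      # flattened input window of the MTS
    x_rec: Sequence[float],  # reconstruction of x produced by the decoder
    z: Sequence[float],      # hidden representation from the first encoder
    z_rec: Sequence[float],  # hidden representation from the second encoder
    weight: float = 0.5,     # hypothetical trade-off, not from the thesis
) -> float:
    """Combine data-space and hidden-space errors into a single score.

    Assumed form: weight * ||x - x_rec|| + (1 - weight) * ||z - z_rec||.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    data_error = l2_distance(x, x_rec)
    hidden_error = l2_distance(z, z_rec)
    return weight * data_error + (1.0 - weight) * hidden_error


def flag_anomalies(scores: Sequence[float], threshold: float) -> List[bool]:
    """Mark a window as anomalous when its score exceeds the threshold."""
    return [score > threshold for score in scores]
```

In such a setup the model is trained on normal data only, so windows that follow the normal pattern should reconstruct well in both spaces and receive low scores. How the threshold is chosen (for example, from a validation set) is left open here.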
Keywords/Search Tags: Multivariate time series, Clustering, Anomaly detection