
Dimension Reduction, Pattern Matching and Anomaly Detection for Multivariate Time Series

Posted on: 2015-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Dong
GTID: 2308330461473899
Subject: Applied Mathematics
Abstract/Summary:
Multivariate time series are a complex form of time series that arise widely in fields such as finance, hydrology, meteorology and production monitoring. Extracting potentially useful information from large volumes of multivariate time series is currently an active topic for machine learning and data mining researchers. However, multivariate time series have distinctive characteristics: heavy noise, multiple variables, correlation among variables, and unequal lengths along the time dimension. These make many existing machine learning and statistical analysis methods difficult to apply, so studying and developing new analysis methods for multivariate time series has strong practical significance.

This thesis focuses on three mining tasks: dimension reduction for classification, pattern matching, and anomalous sample detection. The main work is summarized as follows:

(1) To address the unequal time dimension and the correlation among variables in multivariate time series, the thesis proposes a dimension reduction algorithm based on SVD and discriminant locality preserving projection (S_DLPP). First, to convert sequences of unequal length into representations of equal length, the algorithm represents each original sample by its first singular vector obtained from SVD. It then projects the sample vectors using discriminant locality preserving projections, which preserves the local manifold structure of the sample vectors while exploiting class information so that, after projection, samples of the same class lie as close together as possible and samples of different classes lie as far apart as possible.
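The length-equalization step of S_DLPP described above can be illustrated with a minimal sketch. It assumes each sample is stored as a NumPy array of shape (T, m) with time along the rows, and that the "first singular vector" is the first right singular vector (length m, independent of T). The abstract does not state which singular vector is used, so this choice, the function name `first_singular_vector`, and the random test data are all assumptions for illustration, not the thesis's implementation.

```python
import numpy as np

def first_singular_vector(series: np.ndarray) -> np.ndarray:
    """Represent a (T x m) series by its first right singular vector (length m).

    Assumption: the right singular vector is used, so the output length
    depends only on the number of variables m, not on the series length T.
    """
    # Thin SVD; the rows of vt are the right singular vectors.
    _, _, vt = np.linalg.svd(series, full_matrices=False)
    v = vt[0]
    # Singular vectors are only defined up to sign, so fix the sign:
    # make the largest-magnitude entry positive.
    if v[np.argmax(np.abs(v))] < 0:
        v = -v
    return v

# Three hypothetical samples with 4 variables and unequal lengths.
rng = np.random.default_rng(0)
samples = [rng.normal(size=(t, 4)) for t in (50, 80, 120)]

# Every sample becomes a vector of the same length (4).
features = np.stack([first_singular_vector(s) for s in samples])
```

The resulting equal-length vectors are what a projection method such as discriminant locality preserving projection would then take as input; that projection step is not sketched here.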
Compared with the direct use of the 2dSVD, LPP and SLPP dimension reduction methods, the S_DLPP algorithm achieves a lower classification error rate at essentially the same running time.

(2) Because pattern representations based on single linear PCA and on SVD consider only the time dimension, a pattern matching algorithm based on tensor multilinear PCA (TMPCA) is proposed. The algorithm treats each multivariate time series as a second-order tensor and applies tensor multilinear PCA to obtain a pattern representation that is reduced in both the time dimension and the variable dimension; it then redefines the similarity measure between patterns using the Frobenius norm. Experimental results show that TMPCA achieves higher matching accuracy and lower running time on both small-scale and large-scale sequences.

(3) Most existing anomaly detection algorithms for time series apply only to univariate time series, which take the form of vectors; they are difficult to apply directly to multivariate time series, which are expressed as matrices. Considering the high dimensionality of time series, the thesis generalizes the ADPP anomaly detection algorithm, which is based on the ε-neighborhood graph, and proposes an improved ADPP algorithm for detecting anomalous samples in multivariate time series (IADPP). IADPP introduces a tensor similarity measure that supports multivariate time series and uses it to construct a k-connection graph over the samples; the k-connection graph avoids the ε-neighborhood graph's dependence on dataset density. Experimental results show that IADPP achieves better detection results on multivariate time series datasets.
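The idea of building a neighbor graph over matrix-valued samples can be sketched as follows. The abstract does not define the tensor similarity measure or the exact construction of the k-connection graph, so this sketch makes two assumptions: the Frobenius-norm distance mentioned for TMPCA is used as a stand-in similarity measure, and the k-connection graph is read as a symmetric k-nearest-neighbor graph. The function names and toy data are hypothetical.

```python
import numpy as np

def frobenius_distances(patterns):
    """Pairwise Frobenius-norm distances between equal-shaped matrices."""
    stacked = np.stack(patterns)                 # shape (n, p, q)
    diff = stacked[:, None] - stacked[None, :]   # shape (n, n, p, q)
    return np.sqrt((diff ** 2).sum(axis=(2, 3)))

def knn_graph(dist, k):
    """Symmetric k-nearest-neighbor adjacency matrix (assumed reading
    of the k-connection graph)."""
    n = dist.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        order = np.argsort(dist[i])
        # Skip the sample itself, then keep its k closest neighbors.
        neighbors = [j for j in order if j != i][:k]
        adj[i, neighbors] = True
    # Connect i and j if either one lists the other as a neighbor.
    return adj | adj.T

# Four toy 2x2 "patterns"; the last one lies far from the others.
patterns = [c * np.ones((2, 2)) for c in (0.0, 1.0, 3.0, 10.0)]
dist = frobenius_distances(patterns)
adj = knn_graph(dist, k=1)
```

Unlike an ε-neighborhood graph, every sample here receives at least k edges regardless of how sparse its region is, which is the density independence the abstract attributes to the k-connection graph.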
Keywords/Search Tags: multivariate time series, dimension reduction, similarity measures, pattern matching, anomaly samples detection