Clustering, Similarity Search And Outlier Detection In Multivariate Time Series

Posted on: 2010-03-03  Degree: Doctor  Type: Dissertation
Country: China  Candidate: D Z Zhou  Full Text: PDF
GTID: 1118360302495090  Subject: Information management and information systems
Abstract/Summary:
Multivariate time series (MTS) datasets are common in fields such as multimedia, medicine, and finance. As time passes, the volume of time series data grows dramatically, so managing and using these time series databases, and discovering the patterns of change within them, is an important and challenging research problem. Such research helps to reveal how things develop, to support scientific decision-making, and to detect outliers. Based on an analysis of the features and applications of multivariate time series, this thesis studies several technical problems: clustering, similarity search, and outlier detection. The findings are briefly introduced as follows:

1. Pattern Representation of Multivariate Time Series
A pattern representation method for multivariate time series based on Principal Component Analysis (PCA) is put forward. Because multivariate time series are large in volume and complex in character, mining the raw series directly is time-consuming and inefficient, and the accuracy and reliability of the mining results sometimes decline. A PCA-based representation of multivariate time series is therefore used to improve the situation.

2. Clustering of Multivariate Time Series
An efficient clustering algorithm for multivariate time series, PCA-CLUSTER, is proposed. Most existing algorithms use the K-means method to cluster low-dimensional data and are not suitable for clustering high-dimensional MTS data. The algorithm applies PCA to reduce the dimension of the MTS and then clusters the selected principal component series.

3. Similarity Search of Multivariate Time Series
A distance-based index structure (Dbis) for similarity search is presented. To perform similarity search efficiently on MTS datasets, the algorithm uses the Eros similarity measure and allows the MTS items to be indexed with a B+-tree structure.

4. Outlier Detection of Multivariate Time Series
MTS samples that differ significantly from the remaining MTS samples are referred to as outliers. A new, highly efficient time series outlier detection algorithm is proposed, based on the k-nearest local outlier pattern detection algorithm and PCA.
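
The outlier-detection idea described in item 4 can be illustrated with a rough sketch. This is not the algorithm proposed in the thesis, whose details the abstract does not give. It assumes each MTS item has already been reduced to a fixed-length feature vector (for instance by PCA), and it swaps in a simpler neighbour-based score: the mean Euclidean distance from an item to its k nearest neighbours. The function name `knn_outlier_scores`, the parameter `k`, and the sample vectors are hypothetical.

```python
import math


def knn_outlier_scores(vectors, k=2):
    """Score each vector by its mean Euclidean distance to its k nearest neighbours.

    A larger score means the item lies farther from its closest neighbours,
    so it is a stronger outlier candidate.
    """
    if not 1 <= k < len(vectors):
        raise ValueError("k must satisfy 1 <= k < number of vectors")
    scores = []
    for i, v in enumerate(vectors):
        # Distances from item i to every other item, smallest first.
        dists = sorted(math.dist(v, u) for j, u in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical reduced feature vectors: four similar items and one far away.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_outlier_scores(features, k=2)

# Index of the item with the highest score, i.e. the strongest outlier candidate.
most_outlying = max(range(len(scores)), key=scores.__getitem__)
print(most_outlying)
```

In this toy data the last item receives by far the largest score. A local method such as the one the abstract names would additionally compare each item's score with the scores of its neighbours, which this sketch leaves out.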
Keywords/Search Tags: Multivariate Time Series, Similarity Measure, Clustering, Similarity Search, Outlier Detection