Research On Nonlinear Time Series Clustering Algorithm Based On Centered Copula Process

Posted on:2022-10-01

Degree:Master

Type:Thesis

Country:China

Candidate:Y T Zhen

Full Text:PDF

GTID:2518306602965999

Subject:Statistics

Abstract/Summary:

Cluster analysis is the process of partitioning data into clusters in order to reveal the inherent properties and regularities of the data. With the development of big data, large volumes of time series data have accumulated through long-term monitoring and recording in various industries, which has given rise to the problem of time series clustering. At present, most research on clustering methods assumes that time series are only linearly dependent, but this assumption often fails in practice. To overcome this limitation, this thesis studies clustering methods applicable to time series with a nonlinear dependence structure and proposes two centered copula-based distances to measure dissimilarity among time series. The specific work is as follows.

Firstly, we introduce the preliminary knowledge and basic theory closely related to this thesis. On the one hand, this covers the basic concepts of clustering, such as the clustering steps, traditional clustering methods, commonly used distance measures, and evaluation criteria for clustering results. On the other hand, we briefly introduce the concept and properties of the copula function and its application in similarity measurement.

Secondly, a clustering algorithm based on the centered Copula-CVM (copula Cramér-von Mises) test statistic is proposed, which is suitable for clustering nonlinear time series data. In this method, the centered copula function is used to measure the strength of dependence between random variables, and the centered copula process is used to capture the dynamic dependence structure of a time series. The distance measures the difference between two centered copula processes by means of the Cramér-von Mises test statistic, and we consider a nonparametric estimator for it. The estimator has an equivalent form that is convenient to compute, which improves the efficiency of the algorithm. The strong consistency of the estimator is also established, which broadens the scope of application of the distance. Simulation results for the hierarchical clustering algorithm based on the centered Copula-CVM distance show that the proposed distance is not only suitable for nonlinear time series data but also achieves high clustering quality for time series with a linear dependence structure.

Finally, for real-world time series with a large lag order, a distance based on the centered Copula-WAD (copula Wasserstein and Anderson-Darling) is proposed as a similarity measure for time series. The Anderson-Darling distance reduces the influence of noisy data by assigning weights. The WAD distance combines the advantages of the Wasserstein distance and the Anderson-Darling distance, so the proposed clustering method based on the centered Copula-WAD distance avoids dependence on the lag order of the time series and thereby addresses the clustering of time series with a larger lag order. The effectiveness of the centered Copula-WAD distance is verified by simulation experiments. The clustering algorithm based on the proposed distance is also applied to population data for major cities in China, and reasonable clustering results are obtained.
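The general idea of a copula-based, Cramér-von Mises-type distance followed by hierarchical clustering can be illustrated with a minimal Python sketch. It is not the thesis's estimator: the abstract gives no formulas, so the sketch assumes a plain empirical copula of the lagged pairs (X_t, X_{t+k}) and averages the squared difference of two such copulas over a grid. The centering step and the WAD weighting are not reproduced, and all function names and parameters (`lag_copula`, `cvm_distance`, `cluster_series`, `grid_size`, `lags`) are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def pseudo_obs(x):
    """Rank-transform a 1-D sample into the open interval (0, 1)."""
    ranks = np.argsort(np.argsort(x)) + 1
    return ranks / (len(x) + 1.0)


def lag_copula(x, lag, grid):
    """Empirical copula of (X_t, X_{t+lag}) evaluated on grid x grid."""
    u = pseudo_obs(x[:-lag])
    v = pseudo_obs(x[lag:])
    iu = (u[:, None] <= grid[None, :]).astype(float)  # shape (n, g)
    iv = (v[:, None] <= grid[None, :]).astype(float)  # shape (n, g)
    return iu.T @ iv / len(u)                         # shape (g, g)


def cvm_distance(x, y, lags=(1,), grid_size=20):
    """Assumed CvM-type discrepancy: root of the grid-averaged squared
    difference between the lagged empirical copulas, summed over lags."""
    grid = (np.arange(grid_size) + 0.5) / grid_size
    total = 0.0
    for k in lags:
        diff = lag_copula(x, k, grid) - lag_copula(y, k, grid)
        total += np.mean(diff ** 2)
    return np.sqrt(total)


def cluster_series(series, n_clusters, lags=(1,)):
    """Average-linkage hierarchical clustering on the pairwise distances."""
    m = len(series)
    d = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            d[i, j] = d[j, i] = cvm_distance(series[i], series[j], lags)
    z = linkage(squareform(d, checks=False), method="average")
    return fcluster(z, t=n_clusters, criterion="maxclust")


def simulate_ar1(phi, n, rng):
    """Toy AR(1) generator, used only to produce example input."""
    eps = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x
```

Calling `cluster_series(list_of_arrays, n_clusters=2)` returns one integer label per series. Because the distance depends only on ranks, it is invariant to monotone transformations of each series, which is the usual motivation for copula-based dissimilarities.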

Keywords/Search Tags:

Nonlinear Time Series, Clustering Algorithm, Centered Copula Process, Distance Measurement, Correlation

Related items

1	Research On Periodic Time Series Clustering Analysis And Forecasting Method Based On Density Measure
2	Research Of Time Series Approximate Representation And Clustering Algorithm
3	Industrial Process Supervisory Operation Rule Mining Based On Time Series Hierarchical Clustering Methods
4	The Research On Estimation Of Distribution Algorithm Based On Copula Theory
5	Research On The Application Of Time Series Clustering Model In Finance
6	Multi-demension Time Series Modeling And Forcasting Analysis
7	The Time Series Similarity Clustering Algorithm Research
8	The Clustering Of Time-Series Data Based On LB_Hust Distance Caculation
9	Financial Time Series Clustering Based On Dynamic Time Warping
10	Research And Implementation Of User Behavior Time Series Clustering