
Research On The Structural Complexity And Similarity Of Time Series

Posted on: 2022-10-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Y He
Full Text: PDF
GTID: 1480306560989329
Subject: Statistics
Abstract/Summary:
Traditional time series analysis often assumes linearity and stationarity. The real world, however, is nonlinear and non-stationary. It is made up of complex systems that constrain and depend on each other, and it is usually difficult to observe the underlying principles and operating mechanisms of these systems directly. Nonlinear and non-stationary time series carry information about such systems and are therefore an important channel for analyzing them. In this dissertation, we focus on the structural complexity and similarity of complex time series in order to explore the internal mechanisms and interrelationships of complex systems. The work is based on probability theory and information entropy, and takes the dynamic structural characteristics of time series as its starting point. We construct new metrics and methods to analyze the complexity and similarity of time series: the q-average of the sample entropy (qSampEnAve), the q-average of the sample entropy difference (qSEDiffAve), global recurrence quantification analysis (GRQA), dynamic Shannon entropy (DySEn), a PDF-induced distance based on permutation cross-distribution entropy (PID), and an improved multidimensional scaling method based on Kronecker-delta dissimilarity (MDSK). The content mainly covers the following four aspects:

1. We propose qSampEnAve, based on Tsallis entropy, and use it to measure the disorder of a sequence: the more disordered the sequence, the higher its qSampEnAve value. Building on qSampEnAve, we further propose qSEDiffAve, which measures the structural complexity of a sequence by quantifying the information difference between structured and random sequences. The results show that sequences in an intermediate state between order and disorder have high structural complexity, while completely disordered and completely ordered sequences have relatively low structural complexity. We use qSEDiffAve to analyze the heartbeat interval sequences of patients with congestive heart failure (CHF), patients with atrial fibrillation (AF), and healthy individuals. The heartbeat interval sequences of healthy individuals have higher structural complexity, while those of patients with AF and CHF have lower structural complexity.

2. We propose global recurrence quantification analysis (GRQA). By quantifying the structural information of the recurrence plot under different thresholds, we obtain a curve of each statistic as a function of the threshold, which makes the structural differences between sequences visible. Traditional recurrence quantification analysis (RQA) usually studies the recurrence structure under a single threshold, so the choice of threshold affects the results. GRQA considers all thresholds and thus obtains more comprehensive structural information, which is more reliable than the single value produced by traditional RQA. The method is therefore an important supplement to traditional RQA. The recurrence rate (RR) curve shows that the threshold strongly affects the richness of recurrence structures, especially for aperiodic sequences, where this richness changes rapidly with the threshold; this further shows that traditional RQA is unreliable. The laminarity (LAM) reflects the stability of the sequence, the determinism (DET) reflects its predictability, and the Rényi entropy (RENTR) reflects the richness of the diagonal structure. These curves can distinguish different types of sequences in detail, which traditional RQA cannot do. In the GRQA analysis of financial data, we find that the Hang Seng Index (HSI) shares structural characteristics with the Shanghai Composite Index (SSE), the Shenzhen Component Index (SZSE), the US Nasdaq Index (NASDAQ), and the US Dow Jones Index (DJI), revealing the diversified background of the financial market environment in Hong Kong, China.

3. We propose the permutation distribution entropy (PDE), which examines the structural characteristics of a sequence through the fluctuation mode of each state vector in the reconstructed phase space. PDE can detect local periodic changes in complex time series: the stronger the periodicity, the higher the PDE value. The method is therefore an important indicator of structural change. On this basis, we further propose two metrics to study the structural complexity and similarity of time series:

(1) We propose dynamic Shannon entropy (DySEn), which combines the characteristics of PDE and Shannon entropy. We combine DySEn with a sliding window and use it for anomaly detection; a region where DySEn > 0.6 is defined as abnormal. DySEn can identify local fluctuations in random, chaotic, periodic, and mixed sequences. This ability does not depend on whether the abnormal region is visible to the naked eye, whether the sequence under test is periodic, or whether the abnormal region is periodic. The method has the characteristics of both Shannon entropy and PDE, and even when neither of those two methods responds to an anomaly, DySEn can still accurately identify the abnormal region. We apply the method to rail corrugation detection and show that DySEn provides important guidance for this task.

(2) We propose a PDF-induced distance based on permutation cross-distribution entropy (PID). Introducing the penalty parameter PIPE solves the entropy calculation problem caused by the symmetry of the distribution function and greatly improves the accuracy of similarity measurement between complex sequences. The results show that PID outperforms the traditional Euclidean distance and detrended cross-correlation analysis (DCCA), with stable performance and noise immunity, and that its parameters have no significant influence on the reliability of the method. It is a reliable indicator for measuring the similarity of complex time series. Applied to stock data, the method yields clustering results that are consistent with the actual financial market environment. This shows that the method clusters complex sequences more effectively and is resistant to noise.

4. We propose an improved multidimensional scaling method based on the Kronecker-delta distance (MDSK). The method defines distance with a new metric, KrD, based on the structure of sequences, which is better suited to complex time series than the traditional Euclidean distance. Its goodness of fit in low-dimensional space far exceeds that of traditional MDS and of multidimensional scaling based on other commonly used distance or similarity measures. In addition, we find that the PID method proposed above can also serve as an important reference for extending multidimensional scaling to nonlinear and complex time series.
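
As a rough illustration of the threshold sweep behind GRQA (aspect 2 of the abstract), the following Python sketch computes only the recurrence rate (RR) at several thresholds. It is not the dissertation's implementation. It assumes a scalar series with no phase-space embedding, uses the absolute difference as the distance, and counts the main diagonal of the recurrence plot. The function names and the threshold grid are hypothetical.

```python
import math


def recurrence_rate(series, threshold):
    """Fraction of ordered pairs (i, j) with |x_i - x_j| <= threshold.

    Assumption: no embedding (dimension 1) and the main diagonal is included.
    """
    n = len(series)
    hits = sum(1 for a in series for b in series if abs(a - b) <= threshold)
    return hits / (n * n)


def recurrence_rate_curve(series, thresholds):
    """RR evaluated at each threshold, giving one curve per series."""
    return [recurrence_rate(series, eps) for eps in thresholds]


# Hypothetical example: a sine wave with period 8 samples.
periodic = [math.sin(2 * math.pi * k / 8) for k in range(64)]
thresholds = [0.0, 0.25, 0.5, 1.0, 2.0]
curve = recurrence_rate_curve(periodic, thresholds)

for eps, rr in zip(thresholds, curve):
    print(f"threshold={eps:.2f}  RR={rr:.3f}")
```

Under these assumptions the curve can only stay flat or rise as the threshold grows, and it reaches 1 once the threshold covers the full range of the series. Comparing how quickly the curve rises for different series is the kind of contrast a threshold sweep exposes and a single-threshold RR value hides.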
Keywords/Search Tags: complex time series, nonlinear dynamics, complexity, similarity measure, information entropy, sample entropy, permutation distribution entropy, multidimensional scaling