| Exploring causality is an important topic in data mining, and discovering causality from time series is one of its important directions. Research results on causality have been applied in social economics, geometeorology, biomedicine, and other fields, and are of great significance to many aspects of social life. At present, most causal analysis methods draw conclusions about the causal relationship between variables from the perspective of the complete time series. However, local segments of the data may not match that conclusion, because the causal relationship between time series may change over time; exploring how the causal relationship between time series changes is therefore a topic worthy of study. Based on the Granger causality analysis method, which is suited to time series, this thesis studies changes in time series causality. The specific work includes the following three aspects: (1) An interference-entropy screening method for weakly correlated time series, based on the distinguishability of two classes of samples, is proposed to evaluate the correlation between time series and to filter out time series that are weakly correlated with the other variables in the variable set. First, the mixed conditional probability containing the characteristics of the two classes of samples (positive and negative) is calculated; next, the attribution probability of a sample within the range of the positive and negative classes is calculated; then the confusion probability is calculated from the mixed conditional probability and the attribution probability; finally, the two-class interference entropy is calculated from the confusion probability to evaluate the distinguishability of the two classes of samples on a given feature. For multivariate time series, a method of converting the time series of two variables into two-class samples is designed so that the two-class interference entropy can be calculated. If the interference entropy 
is small, the distinguishability between the variables is large and their correlation is weak. The experimental results show that the new method measures the distinguishability of two classes of samples better than the comparison method, and that it successfully screens out the time series that are weakly correlated with the other variables in a multivariate time series. (2) A difference-region balance method is proposed to explore changes in the causal relationship between time series from the perspective of data fluctuations, addressing the problem that the causal relationship between time series changes over time. For the time series of two variables, the fluctuation degree Sw of the current sliding window W is first calculated as the fluctuation limit, together with the fluctuation degree Su of the region U adjacent to window W in the forward direction. Then a forward exploration strategy is applied: if Su does not exceed Sw, the difference-region balance detection scheme is used; if Su exceeds Sw, the symmetric-region balance detection scheme is used. Finally, the multiple detection results for window W are combined and output. Experiments show that the overall performance of the new algorithm is better than that of the comparison method on both the simulated and the real data sets, with higher accuracy and stable performance. (3) A sliding window residual ratio (SWRR) method is proposed to analyze changes in the causal relationship between time series. From the perspective of changes in the model relationship, it addresses the problem that the causal relationship between time series changes over time. For the time series of two variables, the Granger method is used to identify time series causality within the sliding window; for the sliding window Wt and the extended windows Wt+s and Wt+2s, a complete regression model is established to calculate the residual ratio of the window, and then the complete model fitted on window Wt is used to fit the extended window 
residuals with the Wt+s and Wt+2s data to calculate the residual ratio of the extended window; a sliding window residual identification criterion is designed to determine whether window Wt+s is a relationship conversion interval; and the relationship conversion point is calculated to correct the range of the causal relationship. For multiple time series variables, the interference-entropy screening method for weakly correlated time series variables is incorporated to avoid analyzing weakly correlated variables. The experimental results on the simulated and real data sets show that the SWRR method achieves higher or similar accuracy compared with the comparison method under different noise variances, and that it is more stable under different window widths and moving steps. |
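
The idea of identifying Granger causality inside a sliding window can be illustrated with a minimal sketch. It is not the SWRR or difference-region balance method of the thesis: it only applies a standard Granger F-test (a restricted autoregression of y against a full regression that also includes lags of x) to each window. The function names, the window width, the step, the lag order, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def _rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ beta
    return float(resid @ resid)


def granger_f_stat(x, y, lag=2):
    """F-statistic for 'x Granger-causes y' on one window, using `lag` lags."""
    n = len(y)
    target = y[lag:]
    # Column k holds the series delayed by k + 1 steps.
    y_lags = np.column_stack([y[lag - k - 1:n - k - 1] for k in range(lag)])
    x_lags = np.column_stack([x[lag - k - 1:n - k - 1] for k in range(lag)])
    ones = np.ones((n - lag, 1))
    restricted = np.hstack([ones, y_lags])      # y explained by its own past
    full = np.hstack([ones, y_lags, x_lags])    # plus the past of x
    rss_r = _rss(restricted, target)
    rss_f = _rss(full, target)
    dof = (n - lag) - full.shape[1]
    return ((rss_r - rss_f) / lag) / (rss_f / dof)


def sliding_granger(x, y, width=100, step=50, lag=2):
    """Return (window start, F-statistic) pairs for each sliding window."""
    results = []
    for start in range(0, len(y) - width + 1, step):
        stop = start + width
        results.append((start, granger_f_stat(x[start:stop], y[start:stop], lag)))
    return results


# Synthetic data (an assumption): x drives y only for t < 200.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
y = 0.1 * rng.normal(size=n)
y[1:200] += 0.8 * x[0:199]

for start, f_stat in sliding_granger(x, y):
    print(f"window starting at {start}: F = {f_stat:.2f}")
```

On this synthetic series, windows lying entirely in the first half should yield very large F-statistics and windows in the second half small ones, while windows straddling t = 200 mix the two regimes. Locating the change point inside such a mixed window is the kind of problem the methods described above are designed to address.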