Multi-granularity Intelligent Analyzing For Time Series

Posted on: 2018-09-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W H Deng
Full Text: PDF
GTID: 1318330542479700
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet and cloud computing technology, data in many fields are growing explosively. How to analyze these data intelligently and mine potentially valuable information from them has become one of the main tasks of the era of big data and intelligence. Among these complex data types, data recorded in chronological order are called time-series data. Current time-series data usually have the inherent features of high dimensionality, uncertainty, and dynamism. High dimensionality has different interpretations in the horizontal and vertical directions: in the horizontal direction it denotes the attribute dimension, and in the vertical direction it refers to the length of the time series when the series is regarded as one sample. Uncertainty usually covers fuzziness, incompleteness, randomness, etc. Dynamism means that a time series constantly accumulates, is updated incrementally, and evolves. These features increase the difficulty of time-series data mining. Focusing on these three characteristics, this thesis studies time-series dimensionality reduction, similarity measurement, prediction, and anomaly detection by introducing the idea of multi-granular computing (MGrC). More specifically, the main research contents of this thesis include the following aspects:

(1) Granulation and dimensionality reduction of time series based on the two-dimensional normal cloud. To address the high dimensionality and uncertainty of time series, this thesis proposes a novel dimensionality reduction method, Two-Dimensional Normal Cloud Representation (2D-NCR), which granulates a time series into several two-dimensional normal clouds and uses the characteristics of the cloud model to represent the series. By considering the data distribution and the variation of the time series simultaneously, it achieves dimensionality reduction efficiently. At the granulated level, a new similarity measure for time series based on 2D-NCR is presented. Its problem-solving approach, based on a "divide-calculate-combine" strategy, is consistent with the way human cognition generally analyzes complex problems. Experiments on time-series classification and clustering show that the proposed methods decrease the error rate significantly.

(2) A Multi-Granularity Fuzzy Time Series (MGFTS) prediction model based on automatic clustering and Particle Swarm Optimization (PSO). To address fuzziness and incompleteness in multi-factor time-series prediction, this thesis proposes an MGFTS model that partitions the universe of discourse by automatic clustering, handles missing attribute values with granular computing, and employs PSO for multi-granularity combined prediction. By selecting and jointly computing over multiple granularity levels, the MGFTS model makes better use of the relevance between the main factor and the secondary factors and handles missing attribute values efficiently, which yields higher prediction accuracy.

(3) A multi-granularity water quality prediction model based on Gaussian Cloud Transformation and Fuzzy Time Series (GCT-FTS). The uncertainty of water quality time series causes a "this and that" problem in prediction: a value in the border region between two adjacent partitions may belong to either one. To address it, this thesis proposes a GCT-FTS model that exploits the approximate periodicity of water quality time series. The model uses the Gaussian cloud transformation to granulate the numerical time series and thereby obtain the partition of the universe of discourse. This soft partition resolves the "this and that" uncertainty in the border region between two adjacent partitions. When constructing the fuzzy logical relationships, the GCT-FTS model uses the approximate periodicity of the water quality time series to remove fuzzy logical relationships that may have negative effects, which improves the accuracy and robustness of the prediction model.

(4) Online detection of abnormal time series. For offline time-series datasets, this thesis presents a Heuristic Density-Attractor based Anomaly Detection algorithm (HDA-AD), which assigns an "anomaly score" to each instance based on the set of density-attractors and uses this score to quantify the abnormality of the time-series instance. In the HDA-AD algorithm, the density-attractors are the local maxima of the density function of the time-series instances in the dataset, and the "anomaly score" not only quantifies the "nonconformity" of a candidate time series within the dataset but also considers its "influence" on the other series. Then, to adapt to the dynamics of time series, this thesis proposes, on the basis of the HDA-AD algorithm, an online approach for detecting abnormal time-series instances, Online Learning Density-Attractor (OLDA-AD). It uses the neighborhood of each new instance to update the significant density-attractor set online, so that the set continually captures the essential characteristics of the incremental dataset. The experimental results show that the proposed algorithms achieve competitive accuracy and detection delay in anomaly detection.

These findings show that applying the idea of MGrC to high dimensionality, uncertainty, and dynamism enables efficient intelligent analysis of time-series data.
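As a rough illustration of the granulation idea in contribution (1), the sketch below splits a series into segments and summarizes each one with two normal clouds, one for the values and one for their point-to-point changes. Everything specific here is an assumption rather than the thesis's method: the equal-length segmentation, the use of first differences as the "variation" dimension, the moment-based backward cloud generator used to estimate expectation (Ex), entropy (En), and hyper-entropy (He), and the hypothetical names `Cloud`, `backward_cloud`, and `granulate`. The actual 2D-NCR construction may differ.

```python
import math
from typing import List, NamedTuple, Sequence, Tuple


class Cloud(NamedTuple):
    ex: float  # expectation: center of the grain
    en: float  # entropy: spread of the grain
    he: float  # hyper-entropy: uncertainty of the spread


def backward_cloud(xs: Sequence[float]) -> Cloud:
    """Estimate (Ex, En, He) from samples (moment-based backward generator)."""
    n = len(xs)
    ex = sum(xs) / n
    # En from the first absolute central moment.
    en = math.sqrt(math.pi / 2) * sum(abs(x - ex) for x in xs) / n
    # Sample variance; defined as 0 for a single sample.
    s2 = sum((x - ex) ** 2 for x in xs) / (n - 1) if n > 1 else 0.0
    # Clamp at 0 so a small negative difference cannot break sqrt.
    he = math.sqrt(max(s2 - en * en, 0.0))
    return Cloud(ex, en, he)


def granulate(series: Sequence[float], n_segments: int) -> List[Tuple[Cloud, Cloud]]:
    """Return one (value cloud, change cloud) pair per near-equal segment."""
    n = len(series)
    if not 1 <= n_segments <= n // 2:
        raise ValueError("each segment needs at least two points")
    # Integer boundaries; every segment gets at least floor(n / n_segments) points.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    grains = []
    for lo, hi in zip(bounds, bounds[1:]):
        seg = list(series[lo:hi])
        diffs = [b - a for a, b in zip(seg, seg[1:])]  # assumed "variation" dimension
        grains.append((backward_cloud(seg), backward_cloud(diffs)))
    return grains


if __name__ == "__main__":
    for value_cloud, change_cloud in granulate([1, 2, 3, 4, 5, 6, 7, 8], 2):
        print(value_cloud, change_cloud)
```

Under these assumptions, a series of length n is reduced to n_segments pairs of three-parameter clouds, which is where the dimensionality reduction comes from; a similarity measure at the granulated level would then compare these pairs rather than the raw points.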
Keywords/Search Tags: Time-Series Data Mining, Multi-granularity Computing, Dimensionality Reduction, Similarity Measure, Fuzzy Time Series Prediction, Anomaly Detection, Online Detection