EMD And BoF Models Based Time Series Data Mining And Applications

Posted on: 2019-04-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W P Huang
Full Text: PDF
GTID: 1318330545485714
Subject: Control Science and Engineering
Abstract/Summary:
Mining time series data for useful information is of practical significance. Unlike traditional static data, the values of a time series are ordered by timestamp, whereas the order of values in static data is unimportant. Time series in the same batch may have different lengths, while static data in the same batch has the same length. Moreover, time series are characterized by large volume, high dimensionality, multiple features and strong noise. All of these pose great challenges to time series data mining, in both precision and efficiency, which makes research in this area important.

This thesis focuses on time series data. Its contribution to time series data mining is threefold: time series decomposition, dimension reduction, and time series classification. In addition, industrial system operation monitoring is investigated on the basis of time series analysis. The work is summarized as follows:

1. In time series decomposition, the noise-assisted empirical mode decomposition method usually suffers from high computational complexity. To address this issue, a partial noise assisted multivariate empirical mode decomposition (MEMD) method is proposed. Band-limited high-frequency assisting noise is used in MEMD, so that the noise restrains the mode mixing problem at high frequencies while causing little disturbance at low frequencies. Compared with conventional approaches, the new method achieves high-precision decomposition and reduces the computational complexity. Numerical simulations show its decomposition performance. Its application to bearing fault diagnosis based on vibration signal analysis is also investigated.

2. To address the high computational complexity and low accuracy of time series classification, a classification method based on the bag-of-features model is proposed. The method segments the time series with a data-driven segmentation approach, which retains the integrity of potential patterns. The interval feature and the normal cloud model feature are then extracted, and a Gaussian mixture model and Fisher vectors are used for feature coding. Finally, a linear support vector machine is used for classification. Compared with other state-of-the-art methods, the new method has high classification accuracy and low computational complexity. Experiments on 43 UCR time series datasets verify its classification performance.

3. The bag-of-features model based time series classification method is extended to multi-dimensional time series. The adaptive segmentation approach and the local feature extraction are modified for this purpose. Experiments on multi-dimensional time series datasets show the performance of the modified method. In addition, the method is applied to estimate the operation state of an air conditioning system. The results show that it can recognize different operation states and perform fault classification for the air conditioning system.

4. The high dimensionality of multi-dimensional time series leads to high computational complexity and degrades classification performance. To handle this problem, a conditional mutual information (CMI) based feature selection method for multi-dimensional time series is developed. First, CMI is used to deal with complicated relationships between different features, such as relevance, redundancy and interaction, and a CMI based feature selection approach is proposed. Then the symbolic aggregate approximation (SAX) algorithm is used to symbolize the time series, so that information entropy can be applied to time series. Combining the CMI based feature selection with time series symbolization yields a feature selection method for multi-dimensional time series. This method can effectively handle complicated relationships between different dimensions of a time series. Its effectiveness is illustrated by experiments on several UCI multi-dimensional time series datasets. It is also applied to the air conditioning failure dataset to distinguish between different variables.
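The symbolization step in item 4 can be illustrated with a minimal sketch: once each dimension is reduced to a SAX symbol sequence, entropies and conditional mutual information can be estimated from symbol counts. This is not the thesis's implementation. The function names (`sax`, `entropy`, `conditional_mutual_information`), the alphabet size of 4, and the plug-in (frequency-count) entropy estimator are all assumptions made for illustration.

```python
import math
from collections import Counter

import numpy as np

# Breakpoints that split a standard normal into 4 equiprobable regions
# (the usual SAX lookup values for an assumed alphabet size of 4).
BREAKPOINTS_4 = np.array([-0.6745, 0.0, 0.6745])


def sax(series, n_segments, breakpoints=BREAKPOINTS_4):
    """Z-normalise, reduce with PAA, then map segment means to integer symbols.

    Assumes n_segments <= len(series) so that no segment is empty.
    """
    x = np.asarray(series, dtype=float)
    std = x.std()
    x = (x - x.mean()) / std if std > 0 else x - x.mean()
    # PAA: mean of each of n_segments nearly equal-length chunks.
    paa = np.array([chunk.mean() for chunk in np.array_split(x, n_segments)])
    # Symbol k means the segment mean falls in the k-th breakpoint interval.
    return np.searchsorted(breakpoints, paa)


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more aligned symbol sequences."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * math.log2(c / n) for c in Counter(joint).values())


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)
```

In a selection loop, a candidate dimension would typically be scored by its CMI with the class label given the dimensions already chosen. The exact criterion used in the thesis is not stated in the abstract, so that loop is left out here.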
Keywords/Search Tags:Empirical mode decomposition, Bag of features model, Time series classification, Adaptive segmentation, Feature selection, Conditional mutual information