
A Classification Approach For Uncertain Time Series

Posted on: 2013-05-28  Degree: Master  Type: Thesis
Country: China  Candidate: J L Wang  Full Text: PDF
GTID: 2298330467476174  Subject: Computer software and theory
Abstract/Summary:
With advances in science and the economy, data has become easier to collect, and the volume of data people hold is growing at an unprecedented rate; the era of "Big Data" is said to be arriving. Meanwhile, as data management technology has developed, the uncertainty of data has become better understood. Uncertain data is found in medicine, environmental monitoring, economics, and many other fields. Because uncertain data hides rich information, managing and mining it has become a hot research topic in the database field. Representing, modeling, and storing uncertain data is more complex than handling traditional deterministic data, because an uncertain value may have more than one possible outcome. Managing and mining uncertain data is therefore a challenging task.
Time series are an important type of data, with extensive applications in economic analysis, physics, astronomy, medicine, speech processing, and other areas. For traditional deterministic data, many approaches to managing and mining time series exist and perform well. For example, there are two kinds of methods for modeling time series: 1) methods based on a continuous representation in the time domain, such as Piecewise Linear Approximation and Piecewise Regression Approximation; 2) transformation-based methods, such as the Discrete Fourier Transform and the Discrete Wavelet Transform. For measuring time series similarity, the main functions are the Minkowski distance, the edit distance, and the Dynamic Time Warping (DTW) distance. However, these methods cannot be applied directly to uncertain time series.
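As a point of reference for the deterministic similarity functions named above, the following is a minimal Python sketch of the Minkowski distance and the textbook dynamic-programming form of the DTW distance. The function names, the use of absolute difference as the point cost, and the absence of a warping window are illustrative assumptions, not details taken from the thesis.

```python
from math import inf


def minkowski(x, y, p=2):
    """Minkowski distance of order p between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def dtw(x, y):
    """DTW distance, using absolute difference as the point cost (assumed)."""
    n, m = len(x), len(y)
    # cost[i][j] is the cheapest cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat y[j-1]
                                 cost[i][j - 1],      # repeat x[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

Unlike the Minkowski distance, DTW accepts series of different lengths, because one point may be aligned with several points of the other series.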
Therefore, the management and mining methods for traditional data must be improved, or new methods designed, before they can be applied to uncertain time series.
Based on the characteristics of uncertain time series, a new similarity measure, the expected distance, is proposed; it is the expectation of the random variable produced by the distance function between two uncertain points. The expected distance simplifies the model of uncertain time series and reduces their storage cost. Following the ideas behind the Minkowski and DTW distances for traditional deterministic time series, a Minkowski distance and a DTW distance for uncertain time series are proposed and used for uncertain time series classification. Four lower bound functions are proposed for the DTW distance on uncertain time series, which reduce the cost of computing it. These lower bound functions build on the characteristics of the expected distance and on the idea of lower bound functions for the DTW distance on traditional deterministic time series. Finally, experimental results show that the proposed classification approach achieves high accuracy and that the lower bound functions for the DTW distance on uncertain time series prune very effectively.
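The idea of an expected distance can be sketched in Python as follows. The abstract does not state how uncertain points are modeled, so this sketch assumes each uncertain point is a list of equally likely, independent sample values and uses absolute difference as the point distance. The names `expected_distance`, `uncertain_dtw`, and `classify_1nn` are hypothetical, the nearest-neighbor classifier is an assumed choice, and the thesis's four lower bound functions are not reproduced here.

```python
from itertools import product
from math import inf


def expected_distance(u, v):
    """E[|A - B|] for independent A, B drawn uniformly from samples u and v."""
    return sum(abs(a - b) for a, b in product(u, v)) / (len(u) * len(v))


def uncertain_dtw(xs, ys):
    """DTW over uncertain series, with expected distance as the point cost."""
    n, m = len(xs), len(ys)
    # cost[i][j] is the cheapest cumulative cost of aligning xs[:i] with ys[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = expected_distance(xs[i - 1], ys[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def classify_1nn(query, training):
    """Return the label of the training series nearest to query.

    training is a list of (uncertain_series, label) pairs.
    """
    nearest = min(training, key=lambda item: uncertain_dtw(query, item[0]))
    return nearest[1]
```

Under this sample-based model the expected distance between an uncertain point and itself is generally non-zero, because the two draws are treated as independent. In a nearest-neighbor search, a cheap lower bound on `uncertain_dtw` would let a candidate be skipped whenever the bound already exceeds the best distance found so far, which is the pruning role the abstract describes.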
Keywords/Search Tags:time series, uncertain data, DTW, classification, pruning