Research On The Similarity-Based Time Series Data Mining

Posted on:2008-05-04

Degree:Master

Type:Thesis

Country:China

Candidate:X M Lu

Full Text:PDF

GTID:2178360215462604

Subject:Control theory and control engineering

Abstract/Summary:

A time series is a sequence of observations ordered in time. Such data arise widely in fields such as industry, economics, finance, scientific observation, and social science, and managing and using them efficiently is an interesting problem. Classical time series analysis first proposes a hypothesis and then tests its validity, which makes it unsuitable for discovery tasks. Time series data mining can extract hidden and potentially useful knowledge from large amounts of data that users might otherwise overlook.

This thesis studies similarity-based time series data mining, covering representation methods, similarity measures, similarity searching and index structures for time series, and a prototype system for time series data mining. The main work and contributions of the thesis are:

1. A representation method, Segmented Extreme Value Extraction, is proposed. Unlike traditional representations, it captures both the overall trend and the local features of a time series at the same time. It is built on Piecewise Linear Representation and the Landmark Model. Experiments confirm its correctness and high efficiency.

2. A new similarity measure based on the Segmented Extreme Value Dynamic Time Warping (SEDTW) distance is proposed. The SEDTW distance is an effective measure for time series that differ by scaling and warping along the time axis. It divides each time series into several segments, extracts the extreme values in each segment, and then computes the dynamic time warping (DTW) distance between the resulting extreme value series. Compared with the classical DTW distance, the new method is much faster with almost no loss of accuracy, a conclusion supported by the experiments in this thesis.

3. Similarity searching based on the DTW distance in time series databases is studied. The thesis first uses an R*-tree as the index structure of the time series database to improve search efficiency, and then searches for similar series along the R*-tree index using the DTW distance. Both whole matching and subsequence matching algorithms have been implemented.

4. An integrated framework for a time series data mining prototype system is proposed. The framework is composed of functional units and flexible interfaces, and it can support various advanced integrated services, such as pattern mining and similarity searching.
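
The idea behind the SEDTW distance (reduce each series to per-segment extreme values, then apply DTW to the shorter series) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual segmentation rule or its definition of an extreme value, so everything below is an assumption made for illustration: equal-width segments, the minimum and maximum of each segment kept in time order, an absolute-difference point cost, and the hypothetical function names segment_extremes, dtw_distance and sedtw_distance.

```python
from math import inf


def segment_extremes(series, n_segments):
    """Reduce a series to the min and max of each equal-width segment.

    Assumption: equal-width segments and min/max as the "extreme values";
    the thesis may define both differently.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    reduced = []
    for k in range(n_segments):
        start = k * n // n_segments
        end = (k + 1) * n // n_segments
        chunk = series[start:end]
        lo = min(range(len(chunk)), key=chunk.__getitem__)
        hi = max(range(len(chunk)), key=chunk.__getitem__)
        # Keep the extremes in their original time order; a flat segment
        # contributes a single point.
        for i in sorted({lo, hi}):
            reduced.append(chunk[i])
    return reduced


def dtw_distance(a, b):
    """Classical DTW with absolute-difference cost, O(len(a) * len(b))."""
    m = len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def sedtw_distance(a, b, n_segments):
    """DTW computed on the extreme-value series instead of the raw series."""
    return dtw_distance(segment_extremes(a, n_segments),
                        segment_extremes(b, n_segments))
```

Under these assumptions, a series of length n split into s segments shrinks to at most 2s points, so the quadratic DTW table shrinks from roughly n * n cells to at most 4 * s * s cells. That is one plausible reading of why the abstract reports a large speedup; the accuracy trade-off depends on how well the retained extremes preserve the shape of the series.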

Keywords/Search Tags:

time series, time series data mining, similarity measure, DTW distance, SEDTW distance, similarity searching

Related items

1	Research On Feature Representation And Similarity Measure Methods In Time Series Data Mining
2	Research And Application Of Time Series Similarity Pattern Mining
3	Research On Mining And Similarity Searching In Time Series Database
4	Time Series Similarity Search Based On Adaptive Cost Dynamic Time Warping Distance
5	Study On Similarity Query Over Time Series Data
6	Multivariate Time Series Similarity Analysis Method And Application In Data Mining
7	Study On Water Quality Time Series Data Mining And Application Integration
8	Research Of The Algorithm Of Similarity Query And Classification With Shapelets Based On DTW Distance
9	Research On Uncertain Time Series Similarity Matching
10	Research On Mining And Similar Searching In Time Series Database