Time Series Data Mining Based On Similarity Analysis

Posted on: 2012-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: R G Fang
Full Text: PDF
GTID: 2178330332976025
Subject: Computer application technology
Abstract/Summary:
A time series is a sequence of data ordered in time. Such series appear frequently in financial, commercial, scientific and medical fields. How to manage and exploit these data, and how to discover the patterns and knowledge behind them, has become a problem of growing interest. Building on recent research at home and abroad, this thesis studies piecewise linear representation and multi-pattern matching. The main work and contributions are as follows:
1. Survey time series representation, similarity measures and similarity search, with a detailed analysis of the main advantages and disadvantages of existing methods.
2. Define extrema noise and change points, and propose a piecewise linear representation method based on change points. The method takes extrema as candidate points and treats as noise those candidates whose interpolation error falls within a threshold. Experiments show that the method achieves a smaller fitting error on a variety of data sets and remains stable when processing large amounts of data.
3. Propose an adaptive piecewise linear representation method based on change points. The method uses change points as the initial segmentation points and then heuristically selects the key points with the largest interpolation error. Experimental results show that the method greatly reduces the fitting error on data sets from many fields.
4. Propose an envelope lower-bounding algorithm based on piecewise aggregate approximation. The thesis exploits the dimensionality reduction of piecewise aggregate approximation and proves the lower bounding theorem. Theoretical analysis shows that, with an appropriately chosen threshold, the algorithm outperforms both the classical algorithm and the envelope lower-bounding algorithm, so it can handle streaming time series with higher bandwidth.
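The abstract does not give the details of the proposed envelope algorithm, so the following is only a minimal sketch of the underlying idea it builds on: piecewise aggregate approximation (PAA) reduces a series to frame means, and a scaled distance between two PAA representations never exceeds the Euclidean distance between the original series. The function names (`paa`, `euclidean`, `paa_lower_bound`) and the sample data are hypothetical, and the sketch assumes equal-length series whose length is divisible by the number of segments. It illustrates the standard PAA lower bound for Euclidean distance, not the thesis's envelope-based method.

```python
import math


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length frame."""
    n = len(series)
    if n % n_segments != 0:
        # Assumption of this sketch: frames divide the series evenly.
        raise ValueError("len(series) must be divisible by n_segments")
    frame = n // n_segments
    return [
        sum(series[i * frame:(i + 1) * frame]) / frame
        for i in range(n_segments)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa_lower_bound(paa_a, paa_b, n):
    """Lower bound on the Euclidean distance of the original length-n series.

    Within each frame, frame_len * (mean difference)^2 <= sum of squared
    differences, so sqrt(frame_len) * distance(PAA) <= distance(original).
    """
    frame_len = n / len(paa_a)
    return math.sqrt(frame_len) * euclidean(paa_a, paa_b)


# Hypothetical sample data, chosen only for illustration.
query = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0]
candidate = [1.0, 1.0, 3.0, 5.0, 2.0, 0.0, 0.0, 2.0]

lb = paa_lower_bound(paa(query, 4), paa(candidate, 4), len(query))
true_dist = euclidean(query, candidate)
print(lb <= true_dist)
```

In a similarity search, a candidate whose lower bound already exceeds the search threshold can be discarded without computing the full distance, which is why a tight and cheap lower bound matters for streaming data.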
Keywords/Search Tags:Time Series, Similarity analysis, Piecewise linear representation, Extrema noise, Change point, Multi-pattern matching, Lower bounding theorem