Research On Data Mining Technology Of Pattern-based Similarity Search In Time Series Database

Posted on: 2007-07-15 | Degree: Master | Type: Thesis
Country: China | Candidate: J Zhang | Full Text: PDF
GTID: 2178360212965630 | Subject: Computer applications
Abstract/Summary:
Time series data are an important kind of complex data object and occur widely in natural phenomena and in social and economic activity. Applying data mining technology to analyze time series data is therefore of practical significance. The main applications of time series data mining include rule discovery, periodic pattern mining, similarity search, and sequential pattern discovery. This thesis focuses on time series similarity, that is, finding the sequences that are similar to a given time series. It studies both the criterion for measuring time series similarity and fast search algorithms. The main work is as follows:

Time series data are high-dimensional and noisy, and are subject to various distortions such as amplitude scaling and stretching or compression along the time axis. These properties make time series data difficult to mine. The thesis proposes noise-removal and standardization preprocessing, including filling in missing data, data cleaning, dimensionality reduction, and eliminating inconsistencies.

The thesis presents a key-point transformation of the time series curve. The key points reflect the overall trend and characteristics of the time series. With the key points as segment boundaries, each segment is fitted linearly using a maximum likelihood function and the least squares method. The fitted segments represent the change pattern of the time series, and a feature vector is extracted from each of them.

The feature vector of each segment is stored in a tree structure, and the query sequence is transformed in the same way. The thesis presents a similarity criterion that differs from the Euclidean distance: it uses the fuzzy similarity of time series as the similarity measure and searches the sequence tree for subsequences similar to the given sequence. It also presents a fast cut search algorithm and compares it with PAA (Piecewise Aggregate Approximation) to test the practical feasibility of the algorithm.
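The abstract does not give the thesis's exact definitions, so the following Python sketch is only an illustration of the two representations it names, not the author's implementation. It assumes that a "key point" is an endpoint or a local extremum, that each segment is fitted by ordinary least squares alone (the maximum likelihood step is omitted), and that the feature vector is the pair (slope, segment length). The function names `key_points`, `fit_line`, `segment_features`, and `paa` are hypothetical. The tree index and the fuzzy similarity measure are not sketched here.

```python
from statistics import mean


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each of
    n_segments roughly equal-width frames of the series."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    frames = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        frames.append(mean(series[start:end]))
    return frames


def key_points(series):
    """Indices of the endpoints plus every strict local extremum.
    This definition of 'key point' is an assumption for illustration."""
    if len(series) < 2:
        raise ValueError("need at least two points")
    idx = [0]
    for i in range(1, len(series) - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:  # the curve changes direction at i
            idx.append(i)
    idx.append(len(series) - 1)
    return idx


def fit_line(xs, ys):
    """Ordinary least squares fit y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def segment_features(series):
    """One (slope, length) pair per segment between consecutive key points."""
    pts = key_points(series)
    features = []
    for a, b in zip(pts, pts[1:]):
        xs = list(range(a, b + 1))
        slope, _ = fit_line(xs, series[a:b + 1])
        features.append((slope, b - a))
    return features


if __name__ == "__main__":
    data = [0, 1, 2, 3, 2, 1, 0, 1]
    print(key_points(data))
    print(segment_features(data))
    print(paa(data, 4))
```

The key-point representation adapts segment boundaries to the shape of the curve, whereas PAA uses fixed-width frames; this difference is presumably why the thesis uses PAA as its baseline for comparison.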
Keywords/Search Tags:Data Mining, Time Series, Critical Point, Feature of the Pattern, Fuzzy classification