A Number Of Issues In The Time Series Data Mining Research

Posted on: 2009-05-21   Degree: Master   Type: Thesis
Country: China   Candidate: X F Guo   Full Text: PDF
GTID: 2208360245468762   Subject: Computer software and theory
Abstract/Summary:
Time series are an important kind of data found in many fields, such as stock markets, business distribution, and weather, and their storage size grows rapidly over time. Discovering valuable knowledge in such large-scale time series databases is a challenging task of both theoretical significance and practical importance. Based on an analysis of the characteristics of time series data and on practical application requirements, this thesis studies several important techniques in time series data mining. The work covers three aspects: mining feature patterns and relationship patterns; similarity pattern search; and similarity search for multidimensional time series.

For feature pattern mining, traditional time series data mining is reviewed, and several time series segmentation methods are compared and analyzed. The quality of the segmentation algorithm determines the shape of the feature patterns and affects the final outcome of data mining. The Inter-Related Successive Trees (IRST) model is used for mining frequent patterns in time series. Time series are segmented at critical points and symbolized according to domain knowledge and the relative slope of each linear segment. By exploiting the repetition and sequential order of feature patterns, the corresponding algorithm finds frequent patterns across multiple time series without generating large numbers of candidate patterns, which makes it efficient and useful. The experimental results are presented graphically, which makes them easy to apply in practice.

Similar pattern search in time series is studied in three aspects: similarity measurement, storage configuration, and maturity. Based on the IRST index model, the segment dynamic time warping distance is used as the similarity measure. Combined with the critical-point segmentation technique, features are extracted from all subsequences, and the time series are then converted into meaningful symbol sequences according to the segments' features and MATH categorization. Similar patterns are then searched for using a full-text index technique. The method is proved to produce no false alarms and no false dismissals. Experiments show that it searches more efficiently than previous methods and allows matching of sequences of different lengths.

Multidimensional time sequences are an important kind of data stored in information systems, and similarity search is at the core of their applications. Taking the shape features of multidimensional time sequences into account, a fast similarity search method is used in which the shape features of sequences or subsequences are combined with a spatial index structure (k-d tree). This makes it possible to match the shape of sequences or subsequences at no extra cost while searching the tree. The experimental results demonstrate that the algorithm is effective and efficient.
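
The segmentation at critical points, the slope-based symbolization, and the time warping distance mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: critical points are assumed here to be local extrema, the three-letter alphabet (U, D, F) and the `flat` threshold are hypothetical choices, and `dtw_distance` is the classic dynamic time warping distance rather than the segment variant the thesis uses. All function names are illustrative.

```python
from typing import List, Sequence


def critical_points(series: Sequence[float]) -> List[int]:
    """Return the indices of both endpoints and of every strict local extremum.

    Assumption: a critical point is where the first difference changes sign.
    Plateaus (zero differences) are not treated as turning points here.
    """
    n = len(series)
    if n < 3:
        return list(range(n))
    idx = [0]
    for i in range(1, n - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:
            idx.append(i)
    idx.append(n - 1)
    return idx


def symbolize(series: Sequence[float], flat: float = 0.1) -> str:
    """Map each segment between critical points to U (up), D (down) or F (flat).

    The alphabet and the `flat` slope threshold are hypothetical.
    """
    pts = critical_points(series)
    symbols = []
    for a, b in zip(pts, pts[1:]):
        slope = (series[b] - series[a]) / (b - a)
        if slope > flat:
            symbols.append("U")
        elif slope < -flat:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def dtw_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Classic dynamic time warping distance with absolute-difference cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Under these assumptions, `symbolize` turns a numeric series into a short string that a text index could store, and `dtw_distance` compares two series of different lengths, which is the property the abstract highlights for matching.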
Keywords/Search Tags:Data Mining, Time series, Time Warping Distance, Sequence Inter-Related Successive Trees