Efficient Periodic Pattern Mining in Time Series & Sequence Databases

Posted on: 2012-12-21
Degree: Ph.D
Type: Dissertation
University: University of Calgary (Canada)
Candidate: Rasheed, Faraz
Full Text: PDF
GTID: 1468390011461446
Subject: Computer Science
Abstract/Summary:
Periodic pattern mining identifies all patterns that exhibit complete or partial cyclic repetition in a time series or sequence. Periodic pattern mining, also called periodicity detection, has many applications, such as prediction, forecasting, and detection of unusual activity. The diverse nature of the problem increases the complexity of proposed solutions: periodic patterns can be of any size, can start and end at any position in the series, and can exhibit any period; the series itself may contain any mixture of replacement, insertion, and deletion noise; and the series can be very large. Moreover, periodic patterns can be surprising, and a series might be sampled non-uniformly. The goal of this PhD dissertation is to develop a time- and space-efficient, noise-resilient approach that uses a suffix tree as its underlying data structure and can accurately detect all periodic patterns in a time series that conform to the diverse nature described above. The approach ignores redundant periods and reports only unique periods. It has been tested extensively with both real and synthetic data sets, including time series, sequences, bioinformatics data, and stock market data. With synthetic data, different aspects of the technique, such as accuracy, time and space efficiency, scalability, and robustness, are tested. A comparative analysis with other prominent existing approaches shows that the proposed approach is more time and space efficient, can mine larger sequences, is more noise resilient, and reports only unique periodic patterns.
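To make the notions of period, noise tolerance, and redundant periods concrete, the following is a minimal brute-force sketch of symbol periodicity detection. It is not the dissertation's suffix-tree algorithm and does not share its efficiency; it only illustrates the problem being solved. The function name `symbol_periodicity`, the parameters `min_conf` and `max_period`, and the confidence measure (observed occurrences divided by the occurrences a perfectly periodic symbol would have) are assumptions made for this illustration.

```python
from collections import defaultdict


def symbol_periodicity(series, min_conf=0.8, max_period=None):
    """Brute-force search for periodic symbols (illustrative only).

    Returns tuples (symbol, period, offset, confidence), where confidence
    is the fraction of positions offset, offset+period, ... that actually
    hold the symbol. Periods that are multiples of an already reported
    period for the same symbol and offset are skipped as redundant.
    """
    n = len(series)
    if max_period is None:
        max_period = n // 2
    found = set()      # (symbol, period, offset) triples already reported
    results = []
    for p in range(1, max_period + 1):
        for st in range(p):
            positions = range(st, n, p)
            expected = len(positions)
            if expected < 2:
                continue  # a single occurrence cannot show repetition
            counts = defaultdict(int)
            for i in positions:
                counts[series[i]] += 1
            for sym, actual in counts.items():
                conf = actual / expected
                if conf < min_conf:
                    continue
                # Redundant if a divisor q of p was already reported
                # for this symbol at the matching offset.
                redundant = any(
                    (sym, q, st % q) in found
                    for q in range(1, p)
                    if p % q == 0
                )
                if not redundant:
                    found.add((sym, p, st))
                    results.append((sym, p, st, conf))
    return results
```

For example, `symbol_periodicity("abcabdabc", min_conf=0.6)` still reports `c` with period 3 at offset 2, at a confidence of 2/3, even though one occurrence was replaced by `d`. Lowering `min_conf` trades precision for tolerance to replacement noise; this simple position-based check does not handle insertion or deletion noise, which shifts later occurrences.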
The application of the periodicity detection approach to various domains, including biological data (such as DNA and protein sequences) and stock market data, is also analyzed and accompanied by experimental results.
Keywords: time series, periodicity detection, suffix tree, symbol periodicity, partial periodic patterns, full-cycle periodicity, noise resilience, DNA sequence analysis.
Keywords/Search Tags:Time series, Periodic, Pattern mining, Data, Efficient, Noise