Research Of Data Mining On Dynamic Data

Posted on: 2003-12-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Y Guo
Full Text: PDF
GTID: 1118360092475612
Subject: Control Science and Engineering
Abstract/Summary:
This dissertation addresses data mining in time series. It studies the transformation of time series into trend sequences and methods for mining the resulting trend sequences. The main work and results are as follows:
1) Proposes definitions of the trend sequence (trend for short) and related concepts, and shows that trend sequences are essentially strings that describe the interesting information in a time series in a more abstract manner.
2) Discusses the selection of the time-series-to-trend-sequence transformation (trend transformation for short). The concept of a cost function for the transformation is given, and run-length compression of the resulting trends is discussed. Combining the trend transformation cost with the run-length representation of a trend yields an index, the Information Description Cost (IDC), for choosing a proper trend transformation.
3) Proposes a definition of similarity between trends and studies the whole-matching problem for trends. The concept of trend indicator distribution is introduced and, based on it, an algorithm DistFil is proposed that uses the trend indicator distribution to filter candidate trends. In experiments, the running time of the algorithm was satisfactory under the assumptions of a small trend indicator set, a high similarity threshold, and low-frequency trends.
4) Studies the problem of searching for similar sub-trends. Based on the concept of trend indicator distribution, algorithms INDIC and VISL are given. In experiments, the algorithms outperformed existing methods under the same assumptions as in 3).
5) Poses the problem of mining frequent sub-trends in trend sequences. An algorithm INAMFT is given, and its running time in experiments was satisfactory.
6) Studies the possibility of using trends in time-series classification. Different approaches to time-series symbolization are compared and discussed.
7) Performs data mining on a real-world database as an example. The above work was also applied to and tested on this database.
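The abstract does not specify the trend transformation, the similarity measure, or the internals of DistFil, so the following Python sketch only illustrates the general ideas named in items 2) and 3). The symbol set `U`/`D`/`S`, the threshold `eps`, the use of position-wise mismatches as the dissimilarity, and all function names are assumptions made for illustration, not the dissertation's definitions.

```python
from collections import Counter
from itertools import groupby


def to_trend(series, eps=0.0):
    """Map a numeric series to a string of trend indicators.

    Assumed symbol set: 'U' (up), 'D' (down), 'S' (stable).
    A step counts as stable when its absolute change is at most eps.
    """
    symbols = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        if diff > eps:
            symbols.append("U")
        elif diff < -eps:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def run_length(trend):
    """Run-length representation: a list of (indicator, run length) pairs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(trend)]


def mismatch_lower_bound(a, b):
    """Lower bound on position-wise mismatches between equal-length trends.

    Changing one position alters at most two indicator counts by one each,
    so half the L1 distance between the indicator histograms can never
    exceed the true number of mismatching positions.
    """
    if len(a) != len(b):
        raise ValueError("trends must have equal length")
    count_a, count_b = Counter(a), Counter(b)
    l1 = sum(abs(count_a[s] - count_b[s]) for s in set(count_a) | set(count_b))
    return l1 // 2


def filter_candidates(query, candidates, max_mismatches):
    """Keep candidates that the histogram bound cannot rule out."""
    return [
        c for c in candidates
        if len(c) == len(query)
        and mismatch_lower_bound(query, c) <= max_mismatches
    ]


trend = to_trend([1, 2, 3, 3, 2, 1], eps=0.5)
runs = run_length(trend)
survivors = filter_candidates(trend, ["UUUDD", "DDDDD", "UUSDD"], 1)
```

Candidates that survive such a filter would still need an exact comparison. The histogram bound only discards trends that cannot meet the threshold, which is cheapest to exploit when the indicator set is small and the similarity threshold is high.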
Keywords/Search Tags: data mining, time-series, trend sequences, similarity-based sequence query