
Algorithm Study On Short Time Series Mining

Posted on: 2005-07-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K D Luo
Full Text: PDF
GTID: 1118360152968068
Subject: Computer Science and Technology
Abstract/Summary:
Short time series occur frequently in e-business, macroeconomics, scientific research, public management, and so forth. The basic characteristic of short time series data is that each series contains relatively few observations, so existing time series mining algorithms are hard to apply directly to short time series. The thesis analyzes two typical types of short time series and introduces effective mining algorithms according to the characteristics of the data types and the requirements of the relevant applications. The main contributions are summarized as follows:

• Research on mining association rules from cross-sectional short time series. Cross-sectional short time series occur frequently in the statistical data of governments, large-scale enterprises, and organizations. The time series mining task can be divided into two steps: discretization and rule mining. The thesis introduces a novel discretization algorithm based on hypothesis testing and improves the Apriori algorithm. The mining algorithm is applied to the analysis of macroeconomic data of China.

• Research on clustering inequi-interval short time series. Inequi-interval short time series occur frequently in e-business, e-government, and so on. The thesis introduces the Soft-Assignment Gaussian Mixture Models clustering algorithm, which clusters inequi-interval short time series accurately and scales linearly with the size of the dataset, so it can be used on large-scale datasets. The mining algorithm is applied to the analysis of dialup users of China Telecom.

• Using a clustering/predicting method to improve dialup billing processing. Considering the performance issue in a real-life dialup billing system, the thesis introduces a performance improvement solution based on selective caching, which in turn relies on the clustering/predicting method. Based on Soft-Assignment Gaussian Mixture Models, the thesis introduces Mixture Binomial Expanding Models for clustering and prediction, which cluster much faster and predict accurately. I use simulated experiments on real-life dialup data to show the effectiveness of the performance improvement solution.

Based on the algorithm research, I design and develop a prototype for short time series mining. The prototype provides a friendly GUI and an open programming interface.
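The two-step pipeline named in the first contribution (discretization, then rule mining) can be illustrated with a minimal sketch. This is not the thesis's method: the hypothesis-test discretization and the improved Apriori are not described in the abstract, so the code below substitutes a naive difference-threshold discretization and the textbook Apriori algorithm. All function names, the `tol` parameter, and the assumption that every variable is observed at the same few time points are illustrative choices, not taken from the source.

```python
from itertools import combinations


def discretize(series, tol=0.0):
    """Map a short numeric series to trend symbols 'up', 'down', or 'flat'.

    Assumption: a plain difference threshold stands in for the thesis's
    hypothesis-test-based discretization, which is not reproduced here.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        diff = curr - prev
        if diff > tol:
            symbols.append("up")
        elif diff < -tol:
            symbols.append("down")
        else:
            symbols.append("flat")
    return symbols


def build_transactions(named_series, tol=0.0):
    """Build one transaction per time step: a set of (variable, symbol) items."""
    discretized = {name: discretize(vals, tol) for name, vals in named_series.items()}
    n_steps = min(len(symbols) for symbols in discretized.values())
    return [
        frozenset((name, symbols[t]) for name, symbols in discretized.items())
        for t in range(n_steps)
    ]


def apriori(transactions, min_support):
    """Textbook Apriori: return {itemset: support} for all frequent itemsets."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Derive rules (antecedent, consequent, support, confidence)."""
    found = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, supp, confidence))
    return found


if __name__ == "__main__":
    # Invented toy values, not data from the thesis.
    toy = {"x": [1, 2, 3, 2, 3], "y": [10, 12, 15, 13, 14]}
    frequent_sets = apriori(build_transactions(toy), min_support=0.5)
    for antecedent, consequent, supp, conf in rules(frequent_sets, 0.9):
        print(sorted(antecedent), "->", sorted(consequent), supp, conf)
```

Treating each time step as a transaction keeps the sketch short; how the thesis actually forms transactions from cross-sectional series is not stated in the abstract.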
Keywords/Search Tags: Short Time Series, Data Mining, Clustering, Association Rule, Time Series Discretization, Transaction Processing