
Research On Clustering Methods For Time Series

Posted on: 2013-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2268330395479605
Subject: Computer system architecture
Abstract/Summary:
Time series mining is an important topic in data mining, with extensive applications in many fields. Time series are an important kind of data and occur abundantly in economics, meteorology, and other domains. Analyzing these data can reveal potential changes and developments, and can also provide a basis for decision making. With the rapid development and widespread application of database technology, more and more data are being stored in databases. How to analyze and process time series data, and at the same time discover previously unknown and valuable information, has attracted a growing number of researchers.

It is difficult to obtain the desired result when tasks such as similarity query, classification, clustering, and pattern recognition are performed directly on raw time series. Doing so not only makes computation and storage inefficient, but also affects the accuracy and reliability of the algorithms, partly because time series data are often massive, noisy, and subject to short-term volatility. Recently, time series mining, knowledge discovery, forecasting, and similarity search have become hot topics in data mining. The main problems include dimensionality reduction, feature extraction, similarity calculation, similarity search, and clustering.

Based on existing work on time series clustering, this dissertation discusses the following problems.

1. A DTW-based method for clustering symbolized time series is proposed, in order to cluster the unequal-length time series obtained by dimensionality reduction. The key points of each time series are first extracted and symbolized. The similarity between two time series is then calculated by dynamic time warping (DTW). Finally, the normal matrix and the FCM (fuzzy c-means) algorithm are applied to cluster the time series. The experimental results show that the clustering accuracy of the proposed method is better than that of existing methods.

2. A new time series clustering method based on key point technology is proposed. The key points of each time series are first found, and the similarity between series is then measured with the simple Euclidean distance. Finally, the time series are clustered. The experimental results show that the proposed method effectively reduces both the dimensionality of the time series and the computing time. Furthermore, the desired clustering result is obtained when the method is applied to some practical data.

3. An improved method for symbolizing time series is introduced. Symbolic Aggregate Approximation (SAX) is an effective data discretization method that can reduce the dimensionality of time series. After dimensionality reduction, however, the lengths of the time series are generally unequal. To extend existing methods to this case, a new algorithm is proposed. First, symbolic aggregate approximation combined with key point extraction is used for dimensionality reduction, which yields time series with unequal numbers of key points. Then local length equalization and symbolization are computed by the proposed algorithm. Finally, different similarity calculation methods are compared. The experimental results show that the method is simple and effective, and that it extends the range of similarity calculation and clustering methods that can be applied.
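As an illustration of why DTW suits the unequal-length sequences mentioned in point 1, the following is a minimal sketch of the textbook DTW recurrence, not the dissertation's own implementation. The function name `dtw_distance`, the absolute-difference local cost, and the idea of passing symbols as integer codes are all assumptions made for this example; the abstract does not specify them.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    The sequences may have different lengths.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again to an earlier b element
                cost[i][j - 1],      # b[j-1] is matched again to an earlier a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical symbol sequences encoded as integers (e.g. a=0, b=1, c=2, ...).
short_series = [1, 2, 3]
long_series = [1, 2, 2, 3]
distance = dtw_distance(short_series, long_series)
```

Because the warping path may repeat an element of the shorter sequence, the two example sequences align at zero cost even though their lengths differ, which a point-by-point Euclidean distance could not handle directly. The resulting pairwise distances could then be collected into a matrix and passed to a clustering step.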
Keywords/Search Tags: time series, dimensionality reduction, feature extraction, clustering