Clustering For Time Series Based On The Feature

Posted on: 2015-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: S L Li
Full Text: PDF
GTID: 2298330431957571
Subject: Computer software and theory
Abstract/Summary:
Time series data mining is an important branch of data mining whose object of study is a particular kind of data: the time series. Unlike traditional static data, a time series is a complex object that describes how something changes over time. Time series are also ubiquitous: stock prices, image data, and text data can all be viewed as time series. The main research areas of time series data mining include similarity search, clustering, classification, segmentation and pattern discovery, visualization of massive time series, and prediction. This thesis addresses one specific kind of time series clustering: feature-based clustering. First, the thesis gives a general introduction to time series mining, time series clustering, and feature-based time series clustering. Second, it surveys the characteristics of time series and time series similarity measures. Then, building on existing algorithms, it presents an improved fuzzy clustering algorithm, newFCM (new fuzzy C-means), which reduces the data dimension through key point extraction, and a dynamic clustering algorithm, Dyn-Clustering, which is based on that fuzzy clustering algorithm. newFCM extracts the peaks and valleys (turning points) of a time series as its key points; these form a new key sequence that represents the original series, achieving both dimensionality reduction and noise removal. Because the algorithm is sensitive to singular (outlying) values, the Lance distance is used to overcome this shortcoming, and the Lance distance is then adjusted so that it measures the similarity of key sequences more accurately.
Furthermore, a similarity measure based on basic statistical characteristics is introduced to overcome the algorithm's inability to detect translation or stretching of a time series. The thesis then introduces metrics for clustering quality and uses them to evaluate the improved algorithm experimentally from several perspectives. First, experiments evaluate the parameters of the newFCM algorithm. Next, newFCM is compared with the traditional FCM and K-means algorithms. Finally, newFCM and Dyn-Clustering are applied to cluster the price series and trading volume series of 20 domestic commodity futures as a practical application. The experiments show that newFCM clusters time series well and with higher efficiency. In practice, when the classes of the time series are not very distinct, the improved algorithm can cluster the series into different categories according to different time periods; the results help in understanding the dynamic evolution of the time series and in grasping their structural features accurately. In particular, for financial time series, dynamically predicting the category of a series helps capture its changing trend and thereby supports prediction of the series.
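The key point extraction and distance steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `extract_key_points` and `lance_distance` are hypothetical, "Lance distance" is assumed here to mean the Lance-Williams (Canberra) distance, and the adjusted variant of the distance mentioned in the abstract is not reproduced because the abstract does not define it.

```python
def extract_key_points(series):
    """Return (index, value) pairs for the endpoints and every turning point.

    Assumption: a turning point is a strict peak or valley, i.e. the slope
    changes sign. Flat plateaus are not treated as turning points here.
    """
    n = len(series)
    if n <= 2:
        return list(enumerate(series))
    keys = [(0, series[0])]
    for i in range(1, n - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        # A sign change in the slope marks a peak (+ then -) or a valley (- then +).
        if left * right < 0:
            keys.append((i, series[i]))
    keys.append((n - 1, series[n - 1]))
    return keys


def lance_distance(x, y):
    """Canberra (Lance-Williams) distance between two equal-length sequences.

    Each term is |a - b| / (|a| + |b|), so every coordinate contributes at
    most 1 and a single very large value cannot dominate the sum.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:  # a 0/0 term is conventionally treated as 0
            total += abs(a - b) / denom
    return total


if __name__ == "__main__":
    prices = [1, 2, 3, 2, 1, 2, 4]
    print(extract_key_points(prices))
    print(lance_distance([1, 1000], [2, 2000]))
```

Because each coordinate's contribution is bounded, the distance is less affected by an isolated extreme value than the Euclidean distance, which is consistent with the motivation given in the abstract. The sketch requires sequences of equal length; how the thesis compares key sequences of different lengths is not stated in the abstract.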
Keywords/Search Tags: time series, clustering, key points, feature