
Research On The Similarity-Based Clustering Of Time Series

Posted on: 2007-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: L Lv
GTID: 2178360242961050
Subject: Mechanical and electrical engineering
Abstract/Summary:
Time series are an important class of data objects and are easily obtained from scientific, medical, and financial applications, e.g., daily temperatures, production outputs, weekly sales totals, and prices of mutual funds and stocks. To make full use of these data and to discover knowledge in large databases, this thesis studies the similarity-based clustering of time series in detail.

Data mining and knowledge acquisition from time series depend heavily on a suitable learning method, which is used to develop an optimal partitioning, i.e., a clustering, of the data set to be analyzed. The k-means, fuzzy c-means, Dynamic Time Warping (DTW), and Self-Organizing Map (SOM) methods are compared, and two limitations of the traditional SOM are noted: the static architecture of the model and its limited ability to represent hierarchical relations in the data.

To overcome these limitations, a novel technique based on a variant of the SOM, the Growing Hierarchical Self-Organizing Map (GHSOM), is used to perform clustering and pattern discovery on time series data sets. Because random initialization of the units of a new sub-map usually distorts the global topology of neighboring maps, a coherent initialization method is proposed to provide a global orientation of the individual maps in the various layers of the hierarchy.

Simulation and experimental results confirm that the method forms an adaptive architecture that grows in size and depth during training, thereby unfolding the hierarchical structure of the time series data and allowing the data to be analyzed in an exploratory way. Furthermore, the topological similarities of neighboring maps are preserved. The method is effective, efficient, and scalable for clustering large time series data sets.
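As an illustration of the kind of similarity measure compared in the thesis, the following is a minimal sketch of the textbook Dynamic Time Warping distance. It is not taken from the thesis: the function name `dtw_distance`, the absolute-difference local cost, and the absence of a warping-window constraint are all assumptions made for this example.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Assumptions (not from the thesis): local cost is |x - y| and
    no warping-window constraint is applied.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# A series and a time-stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Unlike a point-by-point Euclidean distance, this measure tolerates local stretching along the time axis and accepts series of different lengths, which is one reason it appears alongside k-means and SOM in comparisons of time series clustering methods.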
Keywords/Search Tags: Time Series, Data Mining, Clustering, Similarity, SOM, GHSOM