Research On Time Series Clustering Method Based On U-shapelets

Posted on: 2019-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Yu
GTID: 2428330566463249
Subject: Computer application technology
Abstract/Summary:
Time series are an active research topic in data mining, where the central question is how to extract valuable information from the data. Time series clustering is one of the main ways to do so. U-shapelets (unsupervised shapelets) are highly discriminative subsequences of a time series dataset. They are derived from shapelets and make it possible to partition time series accurately without labels. Clustering based on u-shapelets is highly interpretable and largely avoids the influence of noise. However, the approach still has open problems, which this thesis studies in depth.

First, we observe that the subsequence quality assessment method and the u-shapelet selection strategy used in existing u-shapelets-based time series clustering methods are not appropriate. We therefore studied various subsequence quality assessment methods and found that the I index is a suitable one. On this basis, we combine the I index with diversified top-k query techniques and propose a new time series clustering method, Div Ushap Cluster. The experiments show that the overall performance of Div Ushap Cluster is better than that of traditional clustering methods. Compared with the Brute Force method and the SUSh method, Div Ushap Cluster keeps the running time on most data sets basically unchanged while increasing accuracy on the data sets by 30% over those two methods.

Second, the running time of existing u-shapelets-based time series clustering methods increases greatly on large-scale data sets. To solve this problem, we propose a fast time series clustering method, Fast Ushap Cluster, which is based on Div Ushap Cluster. Fast Ushap Cluster uses turning points and an R-tree index structure to reduce the size of the u-shapelet candidate subsequence set, which effectively reduces time complexity. The experiments show that, in terms of running time, Fast Ushap Cluster is 21 times faster than Div Ushap Cluster and 71 times faster than SUSh. Furthermore, the average clustering accuracy of Fast Ushap Cluster is 0.74201, which is slightly lower than the 0.75477 of Div Ushap Cluster but much higher than the 0.58875 of SUSh.

Finally, building on this theoretical research, a u-shapelets-based time series clustering prototype system is designed and implemented, and the effectiveness of the proposed methods is verified through visualization. The prototype system uses modular development to improve its extensibility. To give a more intuitive understanding of the u-shapelets-based clustering process, the system provides functions for viewing the time series, the u-shapelets, and the clustering results of the Fast Ushap Cluster method and the SUSh method.
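The abstract does not give formulas, so the following is only a minimal Python sketch of two building blocks it mentions, under stated assumptions. The function `sdist` follows the subsequence distance commonly used in the u-shapelets literature (minimum length-normalized Euclidean distance between a z-normalized candidate and every z-normalized window of a series); whether the thesis uses exactly this definition is an assumption. The function `turning_points` uses a simple slope-sign-change rule as a stand-in for the thesis's turning-point definition, which is not stated here. All function names are hypothetical, and the I index, the diversified top-k query, and the R-tree index are not shown.

```python
import math


def znorm(xs):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in xs]


def sdist(candidate, series):
    """Minimum length-normalized Euclidean distance between a candidate
    subsequence and any same-length window of `series` (assumed definition)."""
    m = len(candidate)
    c = znorm(candidate)
    best = math.inf
    for start in range(len(series) - m + 1):
        w = znorm(series[start:start + m])
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(c, w))) / math.sqrt(m)
        best = min(best, d)
    return best


def turning_points(series):
    """Indices where the slope changes sign (assumed turning-point rule).
    Restricting candidate start positions to such indices is one way to
    shrink a candidate set."""
    idx = []
    for i in range(1, len(series) - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:
            idx.append(i)
    return idx
```

A candidate with a small `sdist` to some series and a large `sdist` to others separates the data set into two groups, which is the intuition behind using u-shapelets for clustering without labels.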
Keywords/Search Tags: time series, clustering, u-shapelets, diversified top-k query