Similarity Search On Time Series Based On Clustering Bit Sequences With Changing Mode

Posted on: 2008-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: X P Li
Full Text: PDF
GTID: 2178360272969876
Subject: Computer software and theory
Abstract/Summary:
Time series are an important data type with wide application in business, medicine, engineering, science and other areas. Time series databases from real-life applications record large amounts of information, so there is an urgent need for approaches that are both efficient and effective at handling time series, mining the inherent relations in the data, and searching for similar sequences. Because of its importance, the problem of time series similarity search has attracted increasing research interest.

One important part of time series similarity search is finding sequences with a similar changing mode; however, previous work has some disadvantages. This thesis proposes a new search approach based on clustering the changing modes of bit sequences. The method models the changing mode of a time series as a fixed-length bit sequence and uses the distance between bit sequences as the similarity measure. To avoid a time-consuming sequential scan of the whole time series database, the method first clusters the sequences by similar changing mode and then builds a B+ tree on the clusters. A query proceeds in two steps. The first step searches the cluster index space and finds the candidate set of sequences with a similar changing mode. The second step performs an exact search within the candidate set and eliminates the sequences that do not qualify. In this way the method aims to ensure both search efficiency and effectiveness.

In the experiments, the datasets are randomly generated; the dimension ranges from 20 to 60 and the dataset size varies from 10,000 to 600,000. All search methods are run on the same datasets. The results show that the proposed method achieves better efficiency and better search results.
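
The two-step query described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several details are assumptions made for illustration only: the bit encoding (1 where the series rises, 0 otherwise), Hamming distance as the bit-sequence distance, a simple greedy "leader" clustering with a fixed radius, and a plain list of clusters standing in for the B+ tree index. All function names are hypothetical, and all series are assumed to have the same length so that the bit sequences are fixed-length.

```python
from typing import List, Sequence, Tuple

Bits = Tuple[int, ...]

def to_bits(series: Sequence[float]) -> Bits:
    """Assumed encoding: bit is 1 where the series rises, 0 otherwise."""
    return tuple(1 if b > a else 0 for a, b in zip(series, series[1:]))

def hamming(x: Bits, y: Bits) -> int:
    """Number of positions where two equal-length bit sequences differ."""
    return sum(p != q for p, q in zip(x, y))

def build_clusters(codes: List[Bits], radius: int) -> List[Tuple[Bits, List[int]]]:
    """Greedy leader clustering (an assumption, not the thesis's algorithm):
    each code joins the first leader within `radius`, else starts a cluster."""
    clusters: List[Tuple[Bits, List[int]]] = []
    for i, code in enumerate(codes):
        for leader, members in clusters:
            if hamming(leader, code) <= radius:
                members.append(i)
                break
        else:
            clusters.append((code, [i]))
    return clusters

def search(query: Sequence[float], codes: List[Bits],
           clusters: List[Tuple[Bits, List[int]]],
           radius: int, eps: int) -> List[int]:
    """Return indices of sequences whose bit code is within `eps` of the query."""
    q = to_bits(query)
    # Step 1: cluster-level filtering. By the triangle inequality, a cluster
    # whose leader is farther than eps + radius cannot contain a match.
    candidates = [i for leader, members in clusters
                  if hamming(q, leader) <= eps + radius
                  for i in members]
    # Step 2: exact check on the candidate set, dropping unqualified sequences.
    return sorted(i for i in candidates if hamming(q, codes[i]) <= eps)

data = [[1, 2, 3, 4], [1, 3, 2, 4], [4, 3, 2, 1], [2, 3, 4, 5]]
codes = [to_bits(s) for s in data]
clusters = build_clusters(codes, radius=1)
matches = search([0, 1, 2, 3], codes, clusters, radius=1, eps=0)
```

Because Hamming distance is a metric, the cluster-level filter in step 1 never discards a true match; it only reduces the number of exact comparisons needed in step 2. An ordered index such as a B+ tree would replace the linear pass over the cluster list in a real system.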
Keywords/Search Tags: data mining, time series, cluster analysis, similarity search