A Fast Time Series Shapelet Discovery Algorithm Combining Selective Extraction And Subclass Clustering

Posted on: 2020-11-09
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhao
Full Text: PDF
GTID: 2370330572484272
Subject: Computer Science and Technology
Abstract/Summary:
Time series classification is a classical and active problem in time series data mining. Its task is to assign time series of unknown class to known classes. Unlike traditional classification problems, the attributes of time series data are sequentially ordered, which traditional classification methods do not take into account. Because time series data are high-dimensional and large in volume, feature selection on them with traditional classification methods is computationally very expensive. Time series classification is therefore often treated separately from general classification problems.

Ye and Keogh proposed the concept of the shapelet in 2009. A shapelet is a contiguous subsequence that reflects the class information of a time series to the greatest extent. It explains the classification result well, that is, why a given time series belongs to a given class. Shapelet-based time series classification algorithms are interpretable, accurate, and fast at classification time. Among them, the learning shapelet algorithm, which reaches the highest accuracy, does not rely on a single classifier and can learn shapelets that do not occur in the original time series. It also completes shapelet discovery and classifier construction simultaneously. However, it produces too many shapelets, which costs interpretability and reduces classification speed, and it relies on too many parameters, which leads to long training times and makes dynamic updates difficult.

This paper studies the learning shapelet algorithm in depth, aiming to maintain its high accuracy while addressing its two serious defects: low interpretability and long training time. We use a new selective extraction method to select the shapelet candidate set and change the learning method to speed up the shapelet learning process. To preserve the algorithm's ability to learn shapelets that do not exist in the original time series, and to solve the problem of too many shapelets, two optimization strategies are proposed. Applying time series clustering to the original training set yields shapelets that are not found in the original time series. In addition, a voting mechanism is added to the selective extraction algorithm to solve the problem of excessive shapelet generation.

The main contributions of this paper are as follows:

1. To address the overly long training time of the learning shapelet algorithm, a fast and interpretable algorithm is proposed that selectively extracts shapelet candidate sets from time series. The quality of the candidate set generated by the algorithm is significantly improved, and the number of candidates is greatly reduced at the same time. Because of these two properties, learning shapelets from the candidate set is ultimately faster.

2. Two optimization strategies are proposed to solve the problem that similarity among the generated shapelets reduces interpretability and classification speed. First, subclass clustering is applied to the training set, so that the final shapelets are not restricted to the original training set and the time series fed into the selective extraction algorithm are more differentiated, which helps the extraction of candidate shapelets. Second, a voting mechanism is added to the selective extraction algorithm: the number of votes for each subsequence is counted and overlapping subsequences are removed, which greatly reduces the number of repeated shapelets.
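
Two ideas in the abstract can be illustrated with a short sketch. The first function computes the standard shapelet-to-series distance, the minimum Euclidean distance between the shapelet and any equal-length window of the series. The second shows one plausible way that vote counts could be used to drop overlapping subsequences. The abstract does not specify the voting or overlap rules, so `remove_overlapping`, its tuple layout `(series_id, start, length, votes)`, and the greedy highest-votes-first policy are assumptions made for illustration only, not the thesis's actual method.

```python
import numpy as np


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and any
    equal-length contiguous window of `series`."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    # Each row of `windows` is one length-m subsequence of the series.
    windows = np.lib.stride_tricks.sliding_window_view(series, m)
    squared = ((windows - shapelet) ** 2).sum(axis=1)
    return float(np.sqrt(squared.min()))


def remove_overlapping(candidates):
    """Hypothetical overlap filter (an assumption, not the thesis's rule).

    `candidates` is a list of (series_id, start, length, votes) tuples.
    Candidates are visited from most to fewest votes; a candidate is kept
    only if it does not overlap an already kept candidate from the same
    source series.
    """
    kept = []
    for cand in sorted(candidates, key=lambda c: c[3], reverse=True):
        sid, start, length, _ = cand
        overlaps = any(
            k[0] == sid and start < k[1] + k[2] and k[1] < start + length
            for k in kept
        )
        if not overlaps:
            kept.append(cand)
    return kept
```

A series is then typically classified by comparing its distances to a small set of retained shapelets, which is why reducing repeated or overlapping shapelets helps both interpretability and classification speed.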
Keywords/Search Tags: Time Series, Classification, Shapelet, Candidates, Selective Extraction