
Research On Time Series Classification Algorithm Based On Shapelet Learning And Transformation

Posted on: 2022-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhao
Full Text: PDF
GTID: 2518306521469124
Subject: Computer application technology
Abstract/Summary:
With the development of society, data types are becoming increasingly diverse. Time series data is a sequence of continuous real values generated over time, and its values are usually temporally correlated. Such data is characterized by large volume, high dimensionality, and continual updating over time. Classifying time series data is an important research topic in data mining. Because time series of different classes usually differ within a subsequence, shapelets, which are highly discriminative subsequences, have attracted attention. Shapelet-based time series classification algorithms are interpretable, fast at classification time, and accurate. Two problems remain: shapelet extraction is time-consuming, and classification accuracy on multi-class time series data is unsatisfactory. The main work and contributions of this thesis are therefore as follows:
(1) To address the high time cost of extracting shapelets directly from the original time series data sets, an improved shapelet learning algorithm based on a similarity measure is proposed. First, locality-sensitive hashing and the dynamic time warping (DTW) distance are combined into an improved similarity measure, and the data set is preprocessed with it to remove a large number of similar sequences, which avoids generating similar shapelets. Next, highly discriminative shapelets are learned through a learning function. Finally, the resulting shapelet set is used to classify the time series data. In experiments on 15 data sets from the UCR archive, compared against a variety of time series classification algorithms, the proposed algorithm achieves the leading classification accuracy on 12 data sets. In the running-time comparison, it is ahead of the other three shapelet-based time series classification algorithms on all 15 data sets.
(2) Existing shapelet quality measures are insufficient for extracting shapelets that represent a single class of time series from multi-class data, which leads to low classification accuracy on multi-class data. An improved shapelet extraction algorithm is proposed, and the shapelets it extracts are called single-class shapelets. The quality of a shapelet is evaluated by how well it distinguishes its source time series class from all other classes in the data set, and a length parameter estimation algorithm is proposed to determine the range of shapelet lengths. Finally, the extracted single-class shapelets are used to transform the time series data into ordinary data, which is then classified with the 1-nearest-neighbor (1NN) algorithm. Nine multi-class time series data sets from the UCR archive are selected for experimental verification. Compared with a variety of time series classification algorithms, the proposed algorithm achieves leading classification accuracy; compared with the traditional shapelet transform algorithm, it achieves the leading accuracy on seven data sets.
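The transform-then-1NN step described in (2) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it uses the common definition of the distance between a shapelet and a series (the minimum Euclidean distance over all windows of the shapelet's length), skips normalization, and takes the shapelets as given instead of extracting single-class shapelets. All function names and the toy data are hypothetical.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any
    equal-length window of the series (assumed, common definition)."""
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet must not be longer than the series")
    best_sq = math.inf
    for start in range(len(series) - m + 1):
        sq = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best_sq = min(best_sq, sq)
    return math.sqrt(best_sq)


def shapelet_transform(dataset, shapelets):
    """Map each series to a vector of distances, one per shapelet.
    The result is ordinary tabular data that any classifier can use."""
    return [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]


def predict_1nn(train_features, train_labels, query_features):
    """Return the label of the nearest training vector (Euclidean)."""
    nearest = min(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query_features),
    )
    return train_labels[nearest]


# Toy data (hypothetical): one hand-picked shapelet per class.
shapelets = [[1, 2, 1], [-1, -2, -1]]
train_series = [
    [0, 1, 2, 1, 0, 0],
    [0, 0, 0, 1, 2, 1],
    [0, -1, -2, -1, 0, 0],
    [0, 0, -1, -2, -1, 0],
]
train_labels = ["bump", "bump", "dip", "dip"]

train_features = shapelet_transform(train_series, shapelets)
query_features = shapelet_transform([[0, 0, 1, 2, 1, 0]], shapelets)[0]
print(predict_1nn(train_features, train_labels, query_features))
```

Because each series is reduced to one distance per shapelet, the transformed data has as many columns as there are shapelets, which is why a small, non-redundant shapelet set matters for both speed and accuracy.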
Keywords/Search Tags: Time series classification, shapelet, similarity measurement method, shapelet transform