Research On Time Series Classification Method Based On Combination Shapelets

Posted on: 2020-03-14 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Zong | Full Text: PDF
GTID: 2370330590952370 | Subject: Computer technology
Abstract/Summary:
A time series is data arranged in chronological order, characterized as “massive”, “high-dimensional” and “continuously updated”. Because time series are so widely used, their analysis has attracted considerable attention from scholars. Time series analysis now has an important influence in data mining and machine learning, and has been rated one of the ten most challenging problems in data mining in the 21st century. As a hot topic in time series research, time series classification has a wide range of application scenarios. For example, in manufacturing, classification results can be used to detect abnormal machine operation, and in medicine, classifying electrocardiograms can support the recognition of cardiac diseases. However, because time series are massive, high-dimensional and continuously updated, traditional data mining methods have great limitations when classifying them. To address the problems in existing time series processing methods, this thesis studies two main problems in time series analysis: feature extraction and representation of time series, and time series classification. The main work and innovations of this thesis are:

1. To handle the large volume and high dimensionality of time series data, this thesis uses the trend segmentation points of a time series and calculates the segment fitting error to achieve feature extraction and a piecewise linear representation. Experiments show that the proposed method preserves the temporal and global trend characteristics of the time series while effectively reducing its dimensionality, laying a foundation for more efficient time series classification.

2. A shapelet is a highly discriminative subsequence of a time series. Shapelets distinguish time series of different categories well, and the classification result is interpretable. Because finding shapelets is currently inefficient, this thesis uses the important segmentation points of the time series to filter candidate subsequences, then calculates and evaluates the information gain of the filtered subsequences. The experimental results show that filtering subsequences by important segmentation points effectively improves the efficiency of shapelet discovery while maintaining the accuracy of time series classification.

3. For cases where the classic shapelet lacks expressiveness in some classification problems, this thesis combines multiple shapelets to classify time series. This method inherits and extends the fast shapelet discovery algorithm proposed in (2). After the time series subsequences are evaluated, the candidate shapelets are combined using a combination algorithm. Experiments show that combining shapelets effectively improves their discriminative ability and classification accuracy.
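To make the shapelet evaluation step concrete, the following is a minimal sketch of how a single candidate shapelet can be scored by information gain. It is not the thesis's algorithm: the abstract does not specify how important segmentation points are computed or how shapelets are combined, so neither is shown. The function names, the toy series, and the labels are all illustrative assumptions, and the distance used is the plain minimum Euclidean distance over sliding windows.

```python
import math
from collections import Counter


def subsequence_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any window of the series."""
    m = len(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(sum((series[start + i] - shapelet[i]) ** 2 for i in range(m)))
        best = min(best, d)
    return best


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_information_gain(distances, labels):
    """Try each midpoint between consecutive sorted distances as a split threshold
    and return the highest information gain with its threshold."""
    base = entropy(labels)
    pairs = sorted(zip(distances, labels))
    n = len(pairs)
    best_gain, best_threshold = 0.0, None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates identical distances
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain, best_threshold = gain, threshold
    return best_gain, best_threshold


# Toy data (hypothetical): two series containing a "bump", two without.
dataset = [
    [0, 0, 1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
labels = ["bump", "bump", "flat", "flat"]
candidate = [1, 2, 1]  # candidate shapelet to evaluate

distances = [subsequence_distance(s, candidate) for s in dataset]
gain, threshold = best_information_gain(distances, labels)
print(gain, threshold)
```

In a full search, this score would be computed for every candidate subsequence, which is why reducing the candidate set (as the thesis does with segmentation points) matters for efficiency.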
Keywords/Search Tags:time series classification, time series feature representation, shapelet, interpretability