
Research On Time Series Classification Algorithm And Application Based On Shape Features

Posted on: 2021-03-22; Degree: Master; Type: Thesis
Country: China; Candidate: Q H Lin
GTID: 2370330614471362; Subject: Computer Science and Technology
Abstract/Summary:
A time series is a set of real-valued data with a sequential order, and such data arise widely across many fields of daily life. This type of data is usually characterized by large volume, rich local shape features, and high dimensionality. With the development of sensor devices, more and more time series data are available for scientific research and in-depth data mining, and over the past few decades many researchers have studied time series data mining extensively. Time series classification is one of its important tasks. Although much progress has been made in time series classification, some problems remain. First, time series contain abundant shape features: how can the shape features of a sequence be expressed while overcoming the influence of noise? Second, in time series similarity measurement, how can the local shape similarity of sequences be measured effectively? Finally, how can patterns that are effective for classification be mined quickly? Based on these three questions, this thesis studies how to build an interpretable time series classification algorithm based on the shape features of sequences. The main contributions are as follows:

(1) Existing time series similarity measures consider only the distance between point pairs and are not interpretable. To address this, an interpretable time series similarity measure based on Dynamic Time Warping (DTW) is proposed. It combines a local shape similarity measure with the distance measure of the time series, so that it considers the numerical similarity and the local shape similarity of sequences simultaneously. In addition, the filtering and discretization strategies used in this thesis effectively overcome the influence of noise on the representation of local shape features.

(2) Existing time series representation methods usually extract statistical features such as the mean and variance of subsequences while ignoring the local shape features of the time series. To address this, a time series representation method based on a bag-of-features model of trend and statistical features is proposed. The algorithm uses discretized slopes to represent the local trend features of a sequence and a weighted histogram to encode those trend features, which represents the local shape features of the sequence effectively while remaining robust to noise. The method also incorporates statistical features such as the mean and variance of subsequences, so that it represents the original sequence more completely.

(3) Existing pattern mining methods do not balance time complexity and classification accuracy well. To address this, a dictionary-based, multi-scale, multi-domain time series pattern mining method is proposed. It greatly reduces the number of candidate patterns while retaining the information in the original sequence. In addition, this thesis evaluates candidate patterns with one-way analysis of variance (ANOVA) and selects patterns by their F statistics, which ensures that the selected patterns are effective.

Experimental results show that the proposed methods effectively improve the classification accuracy of time series data, and the analysis of specific examples also demonstrates the interpretability of the algorithms.
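To make the idea behind contribution (1) more concrete, the following is a minimal sketch of a DTW variant whose point cost combines a numerical distance with a local shape term. It is not the method of the thesis, whose exact formulation is not given in this abstract. The function names `discretize_slopes` and `shape_aware_dtw`, the three-symbol slope alphabet, the `threshold` and `shape_weight` parameters, and the additive cost are all assumptions chosen for illustration.

```python
import math


def discretize_slopes(series, threshold=0.1):
    """Map each consecutive difference to -1 (falling), 0 (flat), or +1 (rising).

    Assumption: a fixed threshold stands in for the filtering and
    discretization strategies mentioned in the abstract.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        diff = curr - prev
        if diff > threshold:
            symbols.append(1)
        elif diff < -threshold:
            symbols.append(-1)
        else:
            symbols.append(0)
    # Repeat the last symbol so that every point has a trend symbol.
    symbols.append(symbols[-1] if symbols else 0)
    return symbols


def shape_aware_dtw(a, b, shape_weight=1.0, threshold=0.1):
    """Classic DTW recursion with a hypothetical shape-aware point cost.

    cost(i, j) = |a[i] - b[j]| + shape_weight * (trend mismatch in [0, 1])
    """
    trend_a = discretize_slopes(a, threshold)
    trend_b = discretize_slopes(b, threshold)
    n, m = len(a), len(b)

    # acc[i][j] holds the cheapest cost of aligning a[:i] with b[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            value_cost = abs(a[i - 1] - b[j - 1])
            # Trend symbols differ by at most 2, so this lies in [0, 1].
            shape_cost = abs(trend_a[i - 1] - trend_b[j - 1]) / 2.0
            cost = value_cost + shape_weight * shape_cost
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])

    return acc[n][m]
```

With `shape_weight=0.0` the function reduces to ordinary DTW with an absolute-difference cost, so the extra term can be read as a penalty for aligning points whose local trends disagree. Inspecting which aligned pairs incur that penalty is one plausible route to the kind of interpretability the abstract describes.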
Keywords/Search Tags: Similarity Measure, Local Shape Feature, Sequence Representation, Pattern Mining, Time Series Classification