Research On Time Series Discriminative Feature Mining And Classification Algorithm

Posted on: 2023-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Gao
Full Text: PDF
GTID: 2558306845499184
Subject: Computer Science and Technology
Abstract/Summary:
Time series are ubiquitous in fields such as science, medicine, finance, and industry, and have become one of the most common and important data types. As a result, time series data mining has grown increasingly significant. Time series classification is an important and challenging research problem in this field and has received extensive attention in recent years. Although considerable progress has been made, time series classification still faces two challenges. First, the growing scale of data places higher demands on algorithm efficiency, so improving efficiency while maintaining classification accuracy is an urgent problem. Second, the scarcity of labeled data creates a bottleneck for deep learning in classification tasks. Self-supervised representation learning can alleviate this problem to a great extent, but it remains underexplored in time series classification. To address these problems, this thesis studies accurate and efficient time series classification algorithms and self-supervised representation learning methods for time series. The main contributions are as follows:

(1) To address the high time complexity of existing discriminative subsequence (Shapelet) extraction algorithms, a Shapelet extraction algorithm based on the matrix profile is proposed. The algorithm uses the matrix profile technique to quickly compute similarity vectors between subsequences of time series, and selects Shapelets that are highly discriminative for one class relative to another through a point-to-point comparison between the homogeneous (same-class) matrix profile and the heterogeneous (cross-class) matrix profile. In this way, the similarity between subsequences is linked directly to their discriminative power. Experiments show that the proposed algorithm improves classification accuracy on multiple datasets and significantly improves the efficiency of Shapelet extraction.

(2) To address the low extraction efficiency, sensitivity to noise, and limited expressive power of traditional Shapelets, a random Shapelet forest algorithm embedded with canonical time series features is proposed. First, the algorithm selects Shapelets at random and restricts their scope to improve efficiency. Second, to improve the expressive power of Shapelets and compensate for the accuracy lost through random selection, a random Shapelet transform embedded with canonical time series features is proposed. Experiments show that the proposed method reaches state-of-the-art classification accuracy and has significant advantages in efficiency.

(3) To fully exploit the information in unlabeled time series data and reduce the heavy dependence of time series classification on labeled data, a time series representation learning framework that integrates denoising auto-encoding and contrastive learning is proposed. The method fuses the input reconstruction task of a denoising autoencoder and a contrastive learning task based on the TWIST loss into a unified framework. The two tasks complement each other’s strengths and jointly learn low-dimensional, highly linearly separable representations of time series. Experiments on three real-world datasets verify the effectiveness of the proposed method on downstream linear classification and semi-supervised classification tasks.

In summary, to address the two key challenges currently facing time series classification, this thesis proposes three novel methods and verifies their effectiveness on the corresponding problems through extensive experiments.
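The matrix-profile comparison described in contribution (1) can be illustrated with a small sketch. This is not the thesis's algorithm: it is a brute-force illustration under stated assumptions. It assumes z-normalized Euclidean distance, a fixed subsequence length `m`, and a scoring rule that takes the pointwise difference between the cross-class and same-class profiles. The function names (`znorm`, `profile`, `candidate_shapelet`) are hypothetical, and a real implementation would use a fast matrix profile algorithm rather than nested loops.

```python
import math

def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std < 1e-12:
        return [0.0] * n
    return [(x - mean) / std for x in seq]

def subsequences(series, m):
    """All z-normalized sliding windows of length m."""
    return [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]

def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def profile(query_subs, other_series, m):
    """Brute-force profile: for each query window, the distance to its
    nearest window among all windows of the other series."""
    other_subs = [s for series in other_series for s in subsequences(series, m)]
    return [min(dist(q, o) for o in other_subs) for q in query_subs]

def candidate_shapelet(series, same_class, other_class, m):
    """Return (start index, score) of the window that is close to its own
    class but far from the other class (assumed scoring rule)."""
    subs = subsequences(series, m)
    homo = profile(subs, same_class, m)      # same-class (homogeneous) profile
    hetero = profile(subs, other_class, m)   # cross-class (heterogeneous) profile
    diff = [h2 - h1 for h1, h2 in zip(homo, hetero)]  # point-to-point comparison
    best = max(range(len(diff)), key=diff.__getitem__)
    return best, diff[best]

if __name__ == "__main__":
    # Toy data: class A series contain a spike, class B series are flat.
    a1 = [0, 0, 0, 5, 0, 0, 0, 0]
    a2 = [0, 0, 0, 0, 5, 0, 0, 0]
    b1 = [0] * 8
    b2 = [0] * 8
    print(candidate_shapelet(a1, [a2], [b1, b2], m=3))
```

In the toy data, the windows of `a1` that cover the spike have a near-identical match in `a2` but no match in the flat class, so they receive the largest score. The thesis may rank or filter candidates differently.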
Keywords/Search Tags:Time series, Classification, Shapelet, Self-supervised, Representation learning