Research On Self-supervised Learning Based Time Series Clustering

Posted on: 2022-06-11  Degree: Master  Type: Thesis
Country: China  Candidate: Z Q Qu  Full Text: PDF
GTID: 2518306569494714  Subject: Computer Science and Technology
Abstract/Summary:
Clustering time series has been an important data mining problem for decades, with wide application in meteorology, biology, medicine, industry, and other fields. Traditional time series clustering methods use temporal transformation, representation, and simplification to extract features from time series data, and then apply a conventional clustering algorithm such as K-means to obtain results. Because of their limited feature extraction capability, these methods often produce poor clustering results on large time series datasets with complex features. In addition, the feature extraction module and the clustering module are usually separate, which makes it difficult to optimize the two parts jointly and therefore prevents the model from achieving the best clustering performance.

Self-supervised learning is a deep learning approach that uses information from the data itself to supervise model training. With the development of deep learning, self-supervised learning has been applied to representation learning on massive data. This study proposes a new time series clustering method called Self-supervised Representation learning based Time series Clustering (SRTC). An auto-encoder serves as the basic framework and is trained on the self-representation information of the time series data, yielding an end-to-end clustering model. To generate better clustering results, a clustering loss function is added on the latent state of the model, which steadily optimizes the clustering results during training.

Contrastive learning is a recent self-supervised learning method that achieves good performance on many unsupervised problems. This study also proposes a novel time series clustering method called Self-supervised Contrastive learning based Time series Clustering (SCTC). A binary classifier serves as the basic framework and is trained on the comparative information between different time series, which again yields an end-to-end clustering model. To improve the stability of self-supervised training, a dynamic label generator and a data augmentation method are designed to automatically generate pseudo-labels and pseudo-data for training, with the aim of maximizing the clustering capability of the model.

Experiments on 16 typical benchmark datasets against 6 comparison methods assess the accuracy of the proposed methods. The results show that both methods perform well and that SCTC provides excellent performance compared with state-of-the-art deep learning based clustering algorithms. In addition, this study presents an ablation analysis of each part of the models and a visualization of the clustering process that reveals interesting insights into the clustering results.
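For orientation, the traditional two-stage pipeline that the abstract contrasts with SRTC and SCTC can be sketched as below. This is only an illustrative baseline, not the thesis's method: the choice of piecewise aggregate approximation as the "simplification" step, the function names `paa` and `kmeans`, and the toy data are all assumptions made for this sketch. It shows the separation the abstract criticizes: features are computed once and K-means never influences them.

```python
import numpy as np


def paa(series, n_segments):
    """Piecewise aggregate approximation (assumed feature extractor).

    series: array of shape (n_series, length); length must be divisible
    by n_segments. Returns the mean of each equal-width segment.
    """
    series = np.asarray(series, dtype=float)
    n_series, length = series.shape
    if length % n_segments != 0:
        raise ValueError("length must be divisible by n_segments")
    width = length // n_segments
    return series.reshape(n_series, n_segments, width).mean(axis=2)


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means on fixed features (the separate second stage)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center (Euclidean distance).
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Toy data (hypothetical): two groups of noisy series at different levels.
rng = np.random.default_rng(1)
low = rng.normal(0.0, 0.1, size=(10, 16))
high = rng.normal(5.0, 0.1, size=(10, 16))
data = np.vstack([low, high])

features = paa(data, n_segments=4)        # stage 1: feature extraction
labels, centers = kmeans(features, k=2)   # stage 2: clustering
```

Because `paa` is fixed before `kmeans` runs, no clustering signal can flow back to improve the features, which is the limitation the end-to-end models in the abstract are designed to remove.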
Keywords/Search Tags: time series clustering, self-supervised learning, representation learning, contrastive learning