
Research And Implementation Of A Hankel-Matrix-Factorization-based Technology For Recovering Missing Values In Tagged Time Series

Posted on: 2021-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: S M Wu
GTID: 2370330647450755
Subject: Computer technology
Abstract/Summary:
Time series data are now ubiquitous in daily life. Using artificial intelligence, and machine learning in particular, to mine useful knowledge from such data has broad application prospects and has become a hot topic in both academia and industry. However, various unavoidable factors in real-world systems cause time series data to go missing, which seriously degrades the accuracy and efficiency of time series data mining. As a result, missing value imputation for time series has attracted growing attention from academia and industry. Among the forms of missing time series data, blackouts, in which all data are lost during a certain period, are common yet especially challenging. During a blackout, no coevolving data sequences are available for reference, which greatly increases the difficulty of data recovery. Consequently, many existing approaches that rely on data from other coevolving sequences for missing value imputation are infeasible in this scenario. Given this situation, and based on observations of real-world data sets, we hypothesize that a time series generated by measuring a target reflects both the patterns of the target's self-evolution and the impacts of external events. The external events can therefore be symbolized as tags, and this information can be used to facilitate the recovery of missing values. Following this hypothesis, we propose a novel time series modeling method and, building on it, an imputation algorithm for tagged time series, HKMF-T. Moreover, a series of algorithms including HKMF-T are packaged into a data recovery tool built on the time series database InfluxDB. In summary, this thesis makes the following contributions:

(1) We propose a method for modeling time series data with external tag information. The model integrates the internal change characteristics of the time series (i.e., the evolutionary trend) and the external impact characteristics indicated by tags (i.e., the external impacts). By adding constraints on the corresponding characteristics, we establish the theoretical foundation for the subsequent missing value imputation.

(2) Based on this time series model, we propose a novel Hankel matrix factorization approach, HKMF-T, to recover missing values in time series. HKMF-T uses the partially observed data and the tag information to learn the evolutionary trend and the external impacts, and thereby recovers the missing values. Extensive experiments on real-world data sets evaluate the practical performance of HKMF-T, and the results suggest that HKMF-T outperforms the baseline approaches, achieving higher data recovery accuracy.

(3) We design and implement a tool, based on the time series database InfluxDB, for recovering missing values in time series. Several effective algorithms, including HKMF-T, are packaged in the tool. The tool shows strong recovery performance on tagged time series and is practical to use.
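To make the Hankel-matrix idea concrete, the following is a minimal sketch of blackout imputation by repeated low-rank approximation of a Hankel matrix. It is not HKMF-T: it ignores tags and external impacts entirely and swaps in a generic iterative truncated-SVD scheme (in the style of singular spectrum analysis) for the thesis's factorization. The function names `hankelize`, `dehankelize`, and `impute_blackout`, and the defaults for `window`, `rank`, and `iters`, are assumptions chosen for illustration.

```python
import numpy as np


def hankelize(x, window):
    """Stack lagged copies of x so that H[i, j] = x[i + j]."""
    cols = len(x) - window + 1
    return np.array([x[i:i + cols] for i in range(window)])


def dehankelize(H):
    """Average each anti-diagonal of H back into one series value."""
    window, cols = H.shape
    total = np.zeros(window + cols - 1)
    count = np.zeros(window + cols - 1)
    for i in range(window):
        total[i:i + cols] += H[i]
        count[i:i + cols] += 1
    return total / count


def impute_blackout(x, window=40, rank=2, iters=200):
    """Fill NaN entries of x by repeated low-rank Hankel approximation.

    Illustrative only: no tag information is used (unlike HKMF-T).
    """
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    est = np.where(missing, np.nanmean(x), x)  # start from a mean fill
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(hankelize(est, window), full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # rank-r factorization
        smooth = dehankelize(low_rank)
        est = np.where(missing, smooth, x)  # observed values stay fixed
    return est


# Usage: a synthetic sine wave with a simulated blackout of 10 samples.
t = np.arange(200)
truth = np.sin(2 * np.pi * t / 25)
observed = truth.copy()
observed[95:105] = np.nan
recovered = impute_blackout(observed)
```

The sketch relies on the fact that a series with regular evolution (here a single sinusoid) yields a Hankel matrix of low rank, so a low-rank factor model can extrapolate into the gap without any coevolving sequence. Real data with event-driven jumps would break this assumption, which is the gap the tag-based modeling in the thesis is meant to address.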
Keywords/Search Tags:Tagged Time Series, Missing Value Imputation, Blackouts, Hankel Matrix Factorization