
Research On Time Series Missing Value Imputation Algorithm Based On Dimension And Distribution Forecast

Posted on: 2022-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: H Cheng
GTID: 2530306737488774
Subject: Engineering
Abstract/Summary:
Time series are used everywhere in the real world, for example in stock trend data, weather observation data and medical data. However, accidents such as sensor damage or signal loss cause values in a time series to go missing, which makes the data difficult to use and harms downstream tasks such as classification, regression and prediction. Handling missing values is therefore essential for any subsequent analysis of time series data. Existing approaches deal with missing values by deletion, statistical imputation or machine-learning-based imputation. These works rarely consider the temporal and dimensional relationships between observed values: they treat a time series as ordinary structured data, so the information carried between time steps is lost, and when values are missing over consecutive time steps the information shared between dimensions is not used either. To address these problems, this thesis carries out the following studies in the field of time series imputation:

(1) This thesis analyzes the advantages and problems of existing imputation models and, based on the characteristics of time series data, proposes using an RNN model to capture temporal information. Dimensional dependence is considered at the same time, and the two are used together to impute the missing values in the data.

(2) This thesis proposes GRU-RMF, a GRU-based temporally regularized matrix decomposition model. Building on the TRMF model, it uses the nonlinear gated recurrent unit (GRU) as the regularizer of the temporal feature matrix to capture both long-term and short-term dependence in the time series. The GRU learns the relationships across time, the matrix decomposition learns the relationships across dimensions, and together they impute the missing values of the time series.

(3) This thesis proposes a recursive imputation network based on GAN. The method uses Light GAIN (LGAIN), a generative adversarial network, to learn the real distribution of the original data, imputes the missing values with the generated data, and calculates the error between the generated data and the original data to obtain uncertainty values. A new GRU unit (GRU-F) is designed to refine the imputation further, using the LGAIN-imputed data, the uncertainty values and the time information.

Experiments on multiple real-world datasets show that the proposed model outperforms the baselines in imputation accuracy and achieves state-of-the-art classification results in downstream applications.
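
The idea behind item (2), imputing missing values by factorizing the series into temporal factors and dimension factors while regularizing the temporal factors, can be illustrated with a much simpler stand-in. The sketch below is not GRU-RMF and is not taken from the thesis: it assumes a plain smoothness penalty between consecutive time steps in place of the GRU regularizer, and the function name `impute_mf` and all of its parameters (`rank`, `lam_time`, `lam_l2`, `lr`, `steps`) are hypothetical choices made for illustration only.

```python
import numpy as np


def impute_mf(X, mask, rank=2, lam_time=1.0, lam_l2=0.1,
              lr=0.01, steps=2000, seed=0):
    """Fill missing entries of X with a masked low-rank factorization.

    X    : (T, D) float array; entries where mask is False are ignored.
    mask : (T, D) bool array, True where the value was observed.

    Illustrative assumption: the temporal regularizer is a simple penalty
    on differences between consecutive rows of W, standing in for the
    learned (GRU) regularizer described in the abstract.
    """
    rng = np.random.default_rng(seed)
    T, D = X.shape
    W = 0.1 * rng.standard_normal((T, rank))  # temporal factors, one row per time step
    F = 0.1 * rng.standard_normal((rank, D))  # dimension factors, one column per variable
    X_obs = np.where(mask, X, 0.0)            # neutralize missing entries (may be NaN)

    for _ in range(steps):
        # Reconstruction residual, counted on observed entries only.
        R = np.where(mask, W @ F - X_obs, 0.0)

        # Gradient of 0.5 * ||R||^2 + 0.5 * lam_l2 * (||W||^2 + ||F||^2).
        grad_W = R @ F.T + lam_l2 * W
        grad_F = W.T @ R + lam_l2 * F

        # Gradient of 0.5 * lam_time * sum_t ||W[t] - W[t-1]||^2.
        diff = W[1:] - W[:-1]
        grad_W[1:] += lam_time * diff
        grad_W[:-1] -= lam_time * diff

        W -= lr * grad_W
        F -= lr * grad_F

    # Keep observed values as they are; use the reconstruction elsewhere.
    return np.where(mask, X, W @ F)
```

A call such as `impute_mf(X, mask)` returns an array of the same shape as `X` in which observed entries are untouched and missing entries are replaced by the low-rank reconstruction. Replacing the smoothness penalty with a learned sequence model is the step that the GRU-based regularizer in the abstract would correspond to.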
Keywords/Search Tags:Time Series, Data Imputation, Recurrent Networks, Matrix Decomposition, Generative Adversarial Networks