Study On The Interpolation Method Of Time Series Missing Value

Posted on: 2019-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: W W Cheng
Full Text: PDF
GTID: 2428330545969567
Subject: Control engineering
Abstract/Summary:
With the rapid development of the information age, large amounts of data are used in machine learning and data mining. Most algorithms and related models are built for complete data sets. In the real world, however, data can go missing during collection, sorting, transmission, and storage, which creates many difficulties for data analysis and application. The traditional ways of handling missing values are simple deletion and mean or zero substitution. These methods cause two serious problems: 1) they reduce the available data set, especially when the missing rate is high; 2) they easily introduce bias into the data set, and mean or zero substitution reduces the variance of the data set and distorts its distribution. To address these problems, this paper designs missing value processing algorithms based on sparse representation theory and K-nearest neighbors, and demonstrates their superiority experimentally. The main work completed in this paper includes the following points:

(1) A new missing value interpolation algorithm based on sparse recovery is proposed using sparse representation theory, and verification experiments on a PM2.5 time series are designed. The superiority of the proposed algorithm is shown by analyzing the experimental results of various interpolation algorithms under different missing rates. The influence of different parameters on the missing value interpolation algorithm is also studied.

(2) For multivariable data, a new missing value interpolation algorithm is proposed that combines sparse principal component analysis (SPCA) with the grey relational coefficient K-nearest neighbors imputation algorithm (GKNNI).

(3) Using the proposed SPCA+GKNNI algorithm, interpolation experiments are designed for two kinds of multivariable data, and the results of different interpolation algorithms are compared. The experiments show that the proposed algorithm handles missing data in multivariable data well and achieves some improvement in interpolation accuracy over the traditional KNN interpolation algorithm and the SVD and BPAC algorithms.
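The abstract's claim that mean substitution reduces the variance of a data set can be checked numerically. The sketch below is only an illustration under stated assumptions, not the thesis's method or data: the synthetic series, the 30% missing rate, and the helper names `mean_impute`, `linear_impute`, and `rmse` are all hypothetical, and plain linear interpolation stands in as a simple baseline rather than the sparse-recovery or SPCA+GKNNI algorithms.

```python
import numpy as np

def mean_impute(x):
    """Replace NaN entries with the mean of the observed entries."""
    filled = x.copy()
    filled[np.isnan(filled)] = np.nanmean(x)
    return filled

def linear_impute(x):
    """Fill NaN entries by linear interpolation between observed points."""
    idx = np.arange(len(x))
    observed = ~np.isnan(x)
    return np.interp(idx, idx[observed], x[observed])

def rmse(a, b):
    """Root mean squared error between two equally sized arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic series (assumption): a slow periodic trend plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(500)
series = 50 + 20 * np.sin(2 * np.pi * t / 100) + rng.normal(0, 2, size=t.size)

# Remove roughly 30% of the points completely at random (assumed missing rate).
missing = rng.random(t.size) < 0.3
corrupted = series.copy()
corrupted[missing] = np.nan

mean_filled = mean_impute(corrupted)
linear_filled = linear_impute(corrupted)

# Variance of the observed points versus variance after mean substitution.
print("variance of observed points:   ", np.nanvar(corrupted))
print("variance after mean imputation:", np.var(mean_filled))

# Reconstruction error measured only on the positions that were removed.
print("RMSE, mean imputation:  ", rmse(mean_filled[missing], series[missing]))
print("RMSE, linear imputation:", rmse(linear_filled[missing], series[missing]))
```

Because every imputed point sits exactly at the observed mean, it adds nothing to the sum of squared deviations while increasing the sample count, so the variance after mean substitution is always smaller than the variance of the observed points. This is the distortion the abstract refers to.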
Keywords/Search Tags: Data Missing, Time Series, Sparse Representation, SPCA, GKNNI