| Cultural heritage is the inheritance of culture and the crystallization of history, and protecting it is an important way to preserve cultural and historical materials. However, in the process of cultural heritage protection, objective factors such as harsh natural environments and physical equipment failures cause cultural heritage sensing data to suffer from problems such as data loss, data inconsistency and noise, which poses major challenges to upper-layer data analysis and applications. In particular, when a sensing device is affected by its environment and by its own performance degradation, the data contain a large number of missing values, which further degrades the accuracy of data analysis. It is therefore of great significance to fill in missing values in cultural heritage sensing data. This paper analyzes and summarizes related work on missing value imputation in detail. According to the imputation models used in different application scenarios, missing value imputation models are classified into three categories: statistical method-based models, traditional machine learning-based models and neural network-based models. The survey summarizes the advantages, disadvantages and open challenges of these models, and discusses future research hotspots and trends. To address the characteristics of missing data in cultural heritage, such as few samples, strong correlation between attributes, and noise interference, a sequential missing value imputation model based on Semi-supervised Generative Adversarial Networks (DSGAN-OD) is proposed. In this model, the multi-dimensional data are first denoised and reduced in dimensionality by a Denoising Autoencoder (DAE). Because Generative Adversarial Networks are unsupervised, they cannot fully utilize the classification label information in cultural heritage data. The low-dimensional representation vectors obtained by the DAE are therefore used 
as training samples for Semi-supervised Generative Adversarial Networks (SemiGAN) to learn the features of the incomplete dataset. Meanwhile, an Order Decision (OD) method is adopted to determine the imputation order of missing values according to the correlation between data attributes. Finally, following the order given by OD, the missing values are filled with the complete data generated by SemiGAN, which improves imputation accuracy. To address the strong spatiotemporal correlation between multi-source cultural heritage sensing data, a Categorical Cyclic Imputation Mechanism is proposed. Based on the DSGAN-OD model, the dataset is divided into classes according to class labels, and within each class the imputation order decision and the filling of the first missing value are carried out simultaneously. After each missing value is imputed, the imputation order decision is performed again, and this cycle repeats until all missing values in each class are filled. The effectiveness of the proposed method is verified on the UCI standard dataset and on temperature and humidity data of cultural heritage. Under different missing rates, the imputation performance of the DSGAN-OD model is compared with the existing GAN-based method (GAIN), the Random Forest-based method (MissForest) and Multiple Imputation by Chained Equations (MICE) in terms of Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Area Under Curve (AUC). The experimental results show that the accuracy of the proposed missing value imputation model DSGAN-OD is improved by 21%, 48.2% and 45.1%, respectively. |
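
The evaluation protocol described above (hide entries at a given missing rate, impute them, then score the imputed entries with RMSE and MAE) can be illustrated with a minimal sketch. This is not the DSGAN-OD model: the imputer here is plain column-mean filling used as a stand-in, and the function names `mask_values`, `mean_impute` and `rmse_mae` are hypothetical names chosen for illustration only.

```python
import math
import random


def mask_values(rows, missing_rate, seed=0):
    """Copy `rows` and set each entry to None with probability `missing_rate`.

    Returns the masked copy and the list of (row, column) positions hidden.
    """
    rng = random.Random(seed)
    masked = [list(r) for r in rows]
    positions = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < missing_rate:
                row[j] = None
                positions.append((i, j))
    return masked, positions


def mean_impute(rows):
    """Stand-in imputer (assumption): fill each None with its column mean."""
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 when a column has no observed value at all.
        col_mean = sum(observed) / len(observed) if observed else 0.0
        for r in filled:
            if r[j] is None:
                r[j] = col_mean
    return filled


def rmse_mae(truth, imputed, positions):
    """RMSE and MAE computed only over the entries that were hidden."""
    errors = [imputed[i][j] - truth[i][j] for i, j in positions]
    if not errors:
        return 0.0, 0.0
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae


if __name__ == "__main__":
    # Toy temperature/humidity-like readings (made-up values for illustration).
    data = [[20.1, 55.0], [20.4, 54.2], [21.0, 53.1], [21.3, 52.5]]
    masked, hidden = mask_values(data, missing_rate=0.3, seed=1)
    filled = mean_impute(masked)
    print(rmse_mae(data, filled, hidden))
```

Scoring only the hidden positions keeps the observed entries, which any imputer reproduces exactly, from diluting the error. Comparing several imputers would mean swapping `mean_impute` for each method while keeping the same mask.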