Attribute Associated Neuron Modeling And Missing Value Imputation Based On Neural Network

Posted on: 2022-08-16
Degree: Master
Type: Thesis
Country: China
Candidate: J C Zhu
Full Text: PDF
GTID: 2518306509995259
Subject: Software engineering
Abstract/Summary:
With the development of fields such as AI and machine learning, the importance of data has become increasingly prominent. Individuals and enterprises produce or aggregate large quantities of data every day. Data capture is becoming more and more convenient, but data quality has gradually drawn attention. In particular, missing data occurs often and is difficult to avoid. Implementing algorithms and providing reliable decision analysis both depend on high-quality data, so data imputation has become an important research topic. First, this paper performs regression modeling and imputes missing values based on the Auto Associative Neural Network (AANN). Since the AANN can efficiently estimate missing values under multiple missingness patterns, we introduce incomplete records into the modeling process and propose an Attribute Cross Fitting Model (ACFM) based on AANN. ACFM reconstructs the data transmission paths between input and output neurons and optimizes the model parameters using the training errors on existing (observed) data, thereby improving its ability to fit the relations between attributes of incomplete data. In addition, to handle incomplete model input, this paper proposes a model training scheme (UMVDT) that treats missing values as variables and updates these missing value variables iteratively together with the model parameters. Generative Adversarial Networks (GAN) can generate data that follows the same distribution as the training data, and have become a new hot spot in deep learning. The generator of a GAN shares characteristics with the AANN, so this paper considers transferring ACFM and UMVDT to GAN and proposes a GAN imputation method based on these two strategies. Finally, for the problem of missing values in the China family economic database, this paper imputes the data with a variety of methods and then uses a clustering algorithm based on hierarchy and density to cluster and evaluate the imputed data. The experimental results indicate that the ACFM and UMVDT imputation method is the most effective. The GAN imputation method based on the above two strategies is better than the ACFM-UMVDT method in nearly half of the cases. Moreover, when the clustering algorithm parameters are the same, the China family economic data set imputed by ACFM-UMVDT has the best clustering effect.
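
The idea of treating missing values as variables that are updated together with the model parameters can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not give the ACFM architecture or the UMVDT update rules, so the sketch assumes a single linear autoassociative layer with a zero diagonal (each attribute is predicted only from the other attributes), a squared error computed on observed entries only, and plain gradient descent. The function name, parameters, and learning rate are hypothetical.

```python
import numpy as np

def impute_with_missing_value_variables(X, mask, epochs=1000, lr=0.05, seed=0):
    """Jointly fit a linear autoassociative map and the missing entries.

    X    : (n, d) array; values at missing positions are ignored (may be NaN).
    mask : (n, d) boolean array, True where the entry is observed.
    Returns the imputed matrix Z, the zero-diagonal weights, and the bias.
    Assumes roughly standardised attributes (hypothetical setup).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialise the missing value variables with observed column means.
    col_mean = np.nanmean(np.where(mask, X, np.nan), axis=0)
    Z = np.where(mask, X, col_mean)
    W = rng.normal(scale=0.01, size=(d, d))
    b = np.zeros(d)
    # Zero diagonal: an attribute never predicts itself, only the others do.
    off_diag = 1.0 - np.eye(d)
    missing = ~mask
    for _ in range(epochs):
        W_eff = W * off_diag
        out = Z @ W_eff + b
        # Training error is taken on observed entries only.
        R = (out - Z) * mask
        grad_W = (Z.T @ R) / n * off_diag
        grad_b = R.sum(axis=0) / n
        # A missing entry influences the loss only as an input of its own row,
        # so its update uses the per-row gradient (not divided by n).
        grad_Z = (R @ W_eff.T) * missing
        W -= lr * grad_W
        b -= lr * grad_b
        Z -= lr * grad_Z
    return Z, W * off_diag, b
```

Observed entries are never modified, because the gradient on `Z` is masked to the missing positions; only the missing value variables and the model parameters change during training.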
Keywords/Search Tags:Incomplete Data, ACFM, Imputation, Missing Value Variable, GAN