Researches On Imputation And Classification Of Incomplete Data Based On Variables For Missing Values

Posted on:2021-01-09

Degree:Master

Type:Thesis

Country:China

Candidate:X Wu

Full Text:PDF

GTID:2428330611451406

Subject:Software engineering

Abstract/Summary:

The explosive growth of data brings unprecedented opportunities and challenges for human society, and how to effectively mine the potential value in data has become a crucial topic. Classification, a common form of data analysis, can provide detailed insight into the inherent regularities of data. However, real-world datasets are prone to missing values, which make data mining harder and lower the reliability of inference results. Against this background, this thesis carries out a progressive two-stage study of missing value imputation and incomplete data classification, as follows.

(1) For missing value imputation, we propose a tracking-removed autoencoder, which dynamically redesigns the input structure of the hidden neurons of the autoencoder. In addition, to account for data incompleteness, we design a scheme that treats missing values as variables and lets them participate in network training, so that imputation is complete when training ends. The proposed method makes full use of the present values in the incomplete dataset and uses the tracking-removed autoencoder to model the correlations among attributes, enabling effective imputation under complicated missing patterns. Experiments indicate that the proposed method achieves satisfactory imputation performance.

(2) For incomplete data classification, we first build a regression model with the tracking-removed autoencoder to mine the attribute interdependencies within the data, then reorganize the neurons in the output layer and construct a multi-task learning model that performs imputation and classification simultaneously. Because the model input is incomplete, missing values are treated as variables during both training and prediction and are updated dynamically together with the model parameters. This dynamic optimization of missing values helps the model fit the regression and classification structures implied in the incomplete data. Experiments on UCI datasets validate the effectiveness of the proposed method.

This thesis examines incomplete data in depth from the two aspects above and proposes effective solutions. In the era of big data, where data quality is difficult to guarantee, the research in this thesis has important practical significance.
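
The abstract does not give the network details, so the following is only a minimal NumPy sketch of the idea in (1), not the thesis's implementation. It assumes that "tracking-removed" means the reconstruction of attribute j never receives attribute j as input, and that missing entries are initialised to column means and then updated by gradient descent alongside the weights. The function name `train_trae_sketch`, the tanh hidden layer, the hidden size, the learning rates, and the choice to measure the loss on observed entries only are all illustrative assumptions.

```python
import numpy as np

def train_trae_sketch(X, M, hidden=8, epochs=500, lr=0.1, lr_missing=0.05, seed=0):
    """Hypothetical sketch. X: (n, d) data; entries where M == 0 are ignored
    (they may be NaN). M: (n, d) float mask, 1 = observed, 0 = missing.
    Returns the completed matrix and the per-epoch loss on observed entries."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_obs = M.sum()
    # Assumption: missing entries start at the column mean of the observed values.
    X0 = np.where(M == 1, X, 0.0)
    col_mean = X0.sum(axis=0) / M.sum(axis=0)
    Xf = np.where(M == 1, X, col_mean)  # observed values plus trainable fill-ins
    W1 = 0.1 * rng.standard_normal((d, hidden))
    b1 = np.zeros(hidden)
    W2 = 0.1 * rng.standard_normal((hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        Y = np.zeros((n, d))
        dW1, db1 = np.zeros_like(W1), np.zeros_like(b1)
        dW2, db2 = np.zeros_like(W2), np.zeros_like(b2)
        dXf = np.zeros_like(Xf)
        for j in range(d):
            # Assumed "tracking removal": output j is computed without input j.
            Xin = Xf.copy()
            Xin[:, j] = 0.0
            H = np.tanh(Xin @ W1 + b1)
            Y[:, j] = H @ W2[:, j] + b2[j]
            # Gradient of the mean squared error, observed targets only.
            e = M[:, j] * (Y[:, j] - Xf[:, j]) / n_obs
            dW2[:, j] = H.T @ e
            db2[j] = e.sum()
            dZ = np.outer(e, W2[:, j]) * (1.0 - H ** 2)
            dW1 += Xin.T @ dZ
            db1 += dZ.sum(axis=0)
            dXin = dZ @ W1.T
            dXin[:, j] = 0.0  # input j was removed, so it receives no gradient
            dXf += dXin
        losses.append(0.5 * np.sum(M * (Y - Xf) ** 2) / n_obs)
        # Update the weights, then the missing-value variables (observed entries stay fixed).
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
        # Rescale by n_obs so each missing entry takes a per-row (un-averaged) step.
        Xf -= lr_missing * n_obs * (1.0 - M) * dXf
    return Xf, losses

# Usage on synthetic, correlated data with about 20% of the entries removed.
rng = np.random.default_rng(1)
Z = rng.standard_normal((200, 2))
X_true = Z @ rng.standard_normal((2, 4)) + 0.05 * rng.standard_normal((200, 4))
X_true = (X_true - X_true.mean(axis=0)) / X_true.std(axis=0)
M = (rng.random((200, 4)) > 0.2).astype(float)
X_hat, losses = train_trae_sketch(np.where(M == 1, X_true, np.nan), M)
print("loss on observed entries: first epoch", losses[0], "last epoch", losses[-1])
```

In this sketch the imputed values are simply the final state of the missing-value variables when training stops, which mirrors the abstract's statement that imputation is completed at the end of training. The multi-task model in (2) would add classification outputs and a classification loss on top of this structure; that part is not shown here.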

Keywords/Search Tags:

Incomplete Data, Imputation of Missing Values, Classification, Tracking-removed Autoencoder, Coupling Modeling

Related items

1	Incomplete Data Modeling And Missing Value Imputation Based On Confidence
2	Attribute Correlation Modeling And Missing Value Imputation Of Incomplete Data Based On Fuzzy Partition
3	Attribute Associated Neuron Modeling And Missing Value Imputation Based On Neural Network
4	Research On Missing Value Imputation Method Based On Mixed Information System
5	Imbalanced-type Incomplete Data And Missing Value Imputations Based On TS Modeling
6	Research On Missing Value Imputation Of Incomplete Data
7	Deep Learning Research For Modeling Incomplete Time Series
8	Modeling Of Incomplete Data And Missing Values Imputations Based On Alternate Learning
9	Incomplete Data Ensemble Classification Using Imputation-Revision Framework With Local Neighborhood Information
10	Missing Value Imputation Based On TS Modeling With Alternate Learning